Field of the Invention 

The present invention relates generally to methods and compositions for the detection 
and/or treatment of breast cancer. More specifically, the present invention relates to breast 
cancer-associated proteins and nucleic acids encoding such proteins which represent cellular 
markers for breast cancer detection, and molecular targets for breast cancer therapy. 

Background of the Invention 

Breast cancer is a leading cause of death in women. While the pathogenesis of breast 
cancer is unclear, transformation of normal breast epithelium to a malignant phenotype may be 
the result of genetic factors, especially in women under 30 (Miki et al (1994) Science 266: 
66-71). However, it is likely that other, non-genetic factors also have a significant effect on the 
etiology of the disease. Regardless of its origin, breast cancer morbidity increases significantly if 
it is not detected early in its progression. Thus, considerable effort has focused on the 
elucidation of early cellular events surrounding transformation in breast tissue. Such effort has 
led to the identification of several potential breast cancer markers. For example, alleles of the 
BRCA1 and BRCA2 genes have been linked to hereditary and early-onset breast cancer (Wooster 
et al (1994) Science 265: 2088-2090). The wild-type BRCA1 allele encodes a tumor suppressor 
protein. Deletions and/or other alterations in that allele have been linked to transformation of 
breast epithelium. Accordingly, detection of mutated BRCA1 alleles or their gene products has 
been proposed as a means for detecting breast, as well as ovarian, cancers (Miki et al, supra). 
However, BRCA1 is limited as a cancer marker because BRCA1 mutations fail to account for the 
majority of breast cancers (Ford et al (1995) British J. Cancer 72: 805-812). Similarly, the 
BRCA2 gene, which has been linked to forms of hereditary breast cancer, accounts for only a 
small portion of total breast cancer cases (Ford et al, supra). 

Several other genes have been linked to breast cancer and may serve as markers for the 
disease, either directly or via their gene products. Such potential markers include the TP53 gene 
and its gene product, the p53 tumor suppressor protein (Malkin et al (1990) Science 250: 
1233-1238). The loss of heterozygosity in genes such as the ataxia telangiectasia gene has also been 
linked to a high risk of developing breast cancer (Swift et al (1991) N. Engl. J. Med. 325: 
1831-1836). A problem associated with many of the markers proposed to date is that the oncogenic 
phenotype is often the result of a gene deletion, thus requiring detection of the absence of the 
wild-type form as a predictor of transformation. 

There is, therefore, a need in the art for specific, reliable markers that are differentially 
expressed in normal and transformed breast tissue and that may be useful in the diagnosis of 
breast cancer, in the prediction of its onset or the treatment of breast cancer. Such markers and 
methods for their use are provided herein. 

Summary of the Invention 

The invention provides a variety of methods and compositions for detecting the presence 
of breast cancer in a mammal, for example, a human, and for treating breast cancer in a mammal 
diagnosed with the disease. The invention is based, in part, upon the discovery of a family of 
proteins each member of which is detectable at a higher concentration in serum from a mammal, 
for example, a human, with breast cancer relative to serum from a normal mammal, that is, a 
mammal without breast cancer. Accordingly, these proteins, as well as nucleic acid sequences 
encoding such proteins, or sequences complementary thereto, can be used as breast cancer 
markers useful in diagnosing breast cancer, monitoring the efficacy of a breast cancer therapy 
and/or as targets of such a therapy. 

In one aspect, the invention provides isolated breast cancer-associated protein markers. 
The protein markers are characterized as being detectable at a higher concentration in the serum 
of a mammal, specifically, a human, with breast cancer than in serum of a mammal without 
breast cancer. 

One marker protein is further characterized in that it has a molecular weight of about 16 
kD, and fails to bind in a detectable amount to an anion exchange resin in the presence of 50 mM 
sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI 
chip. 

Another marker protein is further characterized in that it has a molecular weight of about 
17 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, 
and elutes from the anion exchange resin in the presence of 25 mM sodium chloride in 50 mM 
sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a WCX-2 SELDI 
chip. 

Another marker protein is further characterized in that it has a molecular weight of about 
30 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, 
and elutes from the anion exchange resin in the presence of 25 mM sodium chloride in 50 mM 
sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a WCX-2 SELDI 
chip. 

Another marker protein is further characterized in that it has a molecular weight of about 
35 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, 
and elutes from the anion exchange resin in the presence of 25 mM sodium chloride in 50 mM 
sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a WCX-2 SELDI 
chip. 

Another marker protein is further characterized in that it has a molecular weight of about 
20 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, 
and elutes from the anion exchange resin in the presence of 50 mM sodium chloride in 50 mM 
sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI 
chip. 

Another marker protein is further characterized in that it has a molecular weight of about 
24 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, 
and elutes from the anion exchange resin in the presence of 50 mM sodium chloride in 50 mM 
sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI 
chip. 

Another marker protein is further characterized in that it has a molecular weight of about 
28 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, 
and elutes from the anion exchange resin in the presence of 50 mM sodium chloride in 50 mM 
sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI 
chip. Microsequence analysis has identified the marker protein to be a protein known in the art 
as small nuclear ribonucleoprotein B" (Habets et al (1987) Proc Natl Acad Sci, USA 84, 
2421-2425), the amino acid sequence of which is identified hereinbelow as SEQ ID NO: 5. 

Another marker protein is further characterized in that it has a molecular weight of about 
35 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, 
and elutes from the anion exchange resin in the presence of 50 mM sodium chloride in 50 mM 
sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI 
chip. 

Another marker protein is further characterized in that it has a molecular weight of about 
18 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, 
and elutes from the anion exchange resin in the presence of 100 mM sodium chloride in 50 mM 
sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a WCX-2 SELDI 
chip. 

Another marker protein is further characterized in that it has a molecular weight of about 
71 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, 
and elutes from the anion exchange resin in the presence of 100 mM sodium chloride in 50 mM 
sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a WCX-2 SELDI 
chip. Microsequence analysis has identified the marker protein to be a protein known in the art 
as, or related to, the 64 kD subunit of cleavage stimulating factor (Takagaki et al (1987) Proc 
Natl Acad Sci, USA 89, 1403-1407), the amino acid sequence of which is identified 
hereinbelow as SEQ ID NO: 22 and SEQ ID NO: 23. 

Another marker protein is further characterized in that it has a molecular weight of about 
12 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, 
and elutes from the anion exchange resin in the presence of 150 mM sodium chloride in 50 mM 
sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a SAX-2 SELDI 
chip. 

Another marker protein is further characterized in that it has a molecular weight of about 
42 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, 
and elutes from the anion exchange resin in the presence of 200 mM sodium chloride in 50 mM 
sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI 
chip. 

Another marker protein is further characterized in that it has a molecular weight of about 
56 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, 
and elutes from the anion exchange resin in the presence of 200 mM sodium chloride in 50 mM 
sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a nickel SELDI 
chip. 

Another marker protein is further characterized in that it has a molecular weight of about 
35 kD, binds to an anion exchange resin in the presence of 50 mM sodium phosphate, pH 7.0, 
and elutes from the anion exchange resin in the presence of 400 mM sodium chloride in 50 mM 
sodium phosphate, pH 7.0. This marker protein also has a binding affinity to a copper SELDI 
chip. 

Furthermore, the aforementioned breast cancer-associated proteins are further 
characterized as being non-immunoglobulin and/or non-albumin proteins. Furthermore, the 
breast cancer-associated proteins may further define an antigenic region or epitope that may bind 
specifically to a binding moiety, for example, an antibody, for example, a monoclonal or a 
polyclonal antibody, an antibody fragment thereof, or a biosynthetic antibody binding site 
directed against the antigenic region or epitope. In addition, the invention enables one skilled in 
the art to isolate nucleic acids encoding the aforementioned breast cancer-associated proteins or 
nucleic acids capable of hybridizing under specific hybridization conditions to a nucleic acid 
encoding the breast cancer-associated proteins. Furthermore, the skilled artisan may produce 
nucleic acid sequences encoding the entire isolated marker protein, or fragments thereof, using 
methods currently available in the art (see, for example, Sambrook et al, eds. (1989) "Molecular 
Cloning: A Laboratory Manual," Cold Spring Harbor Press). For example, the breast cancer- 
associated protein of the invention, when isolated, can be sequenced using conventional peptide 
sequencing protocols. Based on the peptide sequence, it is possible to produce oligonucleotide 
hybridization probes useful in screening a cDNA library. The cDNA library may then be 
screened with the resultant oligonucleotide to isolate full or partial length cDNA sequences 
encoding the isolated protein. 

In another aspect, the invention provides a variety of methods, for example, protein or 
nucleic acid-based methods, for detecting the presence of breast cancer in a mammal. The 
methods of the invention may be performed on any relevant tissue or body fluid sample. For 
example, methods of the invention may be performed on breast tissue, more preferably breast 
biopsy tissue. Alternatively, the methods of the invention may be performed on a human body 
fluid sample selected from the group consisting of: blood; serum; plasma; fecal matter; urine; 
vaginal secretion; spinal fluid; saliva; ascitic fluid; peritoneal fluid; sputum; and breast exudate. 
It is contemplated, however, that the methods of the invention also may be useful in detecting 
metastasized breast cancer cells in other tissue or body fluid samples. Detection of breast cancer 
can be accomplished using any one of a number of assay methods well known and used in the art. 

In one aspect, the method of diagnosing cancer in an individual comprises contacting a 
sample from the individual with a first binding moiety that binds specifically to a breast-cancer 
associated protein to produce a first binding moiety-cancer-associated protein complex. The first 
binding moiety is capable of binding specifically to at least one of the breast cancer associated 
marker proteins identified hereinabove to produce a complex. Thereafter the presence and/or 
amount of marker protein in the complex can then be detected, for example, via the first binding 
moiety if labeled with a detectable moiety, for example, a radioactive or fluorescent label, or a 
second binding moiety labeled with a detectable moiety that binds specifically to the first binding 
moiety using conventional methodologies well known in the art. The presence or amount of the 
marker protein can thus be indicative of the presence of breast cancer in the individual. For 
example, the amount of marker protein in the sample may be compared against a threshold value 
previously calibrated to indicate the presence or absence of breast cancer, wherein the amount of 
the complex in the sample relative to the threshold value can be indicative of the presence or 
absence of cancer in the individual. Although such a method can be performed on tissue, for 
example, breast tissue, or a body fluid, for example, serum, a body fluid currently is the preferred 
test sample. 
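
The threshold comparison described above can be sketched as follows. This is a minimal illustration only: the function name, the sample readings, and the threshold value are hypothetical placeholders, and an actual threshold would have to come from prior calibration against samples of known status.

```python
def exceeds_threshold(marker_amount: float, threshold: float) -> bool:
    """Return True when the measured marker amount is above the threshold."""
    return marker_amount > threshold


# Hypothetical readings in arbitrary assay units (not data from the source).
SAMPLES = {"sample_A": 3.2, "sample_B": 0.7}

# Placeholder value; a real threshold would be calibrated beforehand.
THRESHOLD = 1.5

# Flag each sample whose marker amount lies above the threshold.
FLAGS = {name: exceeds_threshold(amount, THRESHOLD)
         for name, amount in SAMPLES.items()}
```

Because the marker proteins are described as being present at higher concentrations in serum from individuals with breast cancer, a reading above the threshold is treated here as the indicative result.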

Detection of the aforementioned nucleic acid molecules can also serve as an indicator of 
the presence of breast cancer and/or metastasized breast cancer in an individual. Accordingly, in 
another aspect, the invention provides another method for detecting breast cancer in a human. 
The method comprises the step of detecting the presence of a nucleic acid molecule in a tissue or 
body fluid sample thereby to indicate the presence of breast cancer in an individual. The nucleic 
acid molecule is selected from the group consisting of (i) a nucleic acid molecule comprising a 
sequence capable of recognizing and being specifically bound by a breast cancer-associated 
protein, and (ii) a nucleic acid molecule comprising a sequence encoding at least a portion of one 
or more of the breast cancer-associated proteins identified herein. 

In one embodiment, the method comprises exposing a sample from the individual under 
specific hybridization conditions to a nucleic acid probe, for example, greater than about 10 and 
more preferably greater than 15 nucleotides in length, capable of hybridizing to a target nucleic 
acid encoding one of the breast cancer-associated proteins identified herein to produce a duplex. 
Thereafter, the presence of the duplex can be detected using a variety of detection methods 
known and used in the art. It is contemplated that the target nucleic acid may be amplified, for 
example, via conventional polymerase chain reaction (PCR) or reverse transcriptase polymerase 
chain reaction (RT-PCR) methodologies, prior to hybridization with the nucleic acid probe. 

In one embodiment, the target nucleic acid (for example, a messenger RNA (mRNA) 
molecule), is greater than 15 nucleotides, more preferably greater than 50 nucleotides, and most 
preferably greater than 100 nucleotides in length and encodes an amino acid sequence present in 
one of the breast cancer-associated proteins identified herein. Such a target mRNA may then be 
detected, for example, by Northern blot analysis by reacting the sample with a labeled 
hybridization probe, for example, a ³²P labeled oligonucleotide probe, capable of hybridizing 
specifically with at least a portion of the nucleic acid molecule encoding the marker protein. 
Detection of a nucleic acid molecule either encoding a breast cancer-associated protein or 
capable of being specifically bound by a breast cancer-associated protein, can thus serve as an 
indicator of the presence of a breast cancer in the individual being tested. 

In another aspect, the invention provides a kit for detecting the presence of breast cancer 
or for evaluating the efficacy of a therapeutic treatment of a breast cancer. Such kits may 
comprise, in combination, (i) a receptacle for receiving a human tissue or body fluid sample from 
the individual to be tested, (ii) a binding partner which binds specifically either to an epitope on a 
breast cancer-associated marker protein or to a nucleic acid sequence encoding at least a portion of 
the breast cancer-associated protein, and (iii) a reference sample. In one embodiment, the 
reference sample may comprise a negative and/or positive control. In that embodiment, the 
negative control would be indicative of a normal breast cell type and the positive control would 
be indicative of breast cancer. 

In another aspect, the invention provides methods and compositions for treating breast 
cancer. In one aspect the invention provides proteins or nucleobase-containing sequences useful 
in the treatment of breast cancer. The therapeutic protein could be, for example, a binding 
moiety, for example, an antibody, for example, a monoclonal antibody, an antigenic binding 
fragment thereof, or a biosynthetic antibody binding site capable of binding specifically to a 
breast cancer-associated protein identified herein. The method comprises the step of 
administering to a patient with breast cancer, a therapeutically-effective amount of a compound, 
preferably an antibody, and most preferably a monoclonal antibody, which binds specifically to a 
target breast cancer-associated protein thereby to inactivate or reduce the biological activity of 
the protein. The target protein may be any of the breast cancer-associated proteins identified 
herein. Similarly, it is contemplated that the compound may comprise a small molecule, for 
example, a small organic molecule, which inhibits or reduces the biological activity of the target 
breast cancer-associated protein. 

In another aspect, the invention provides another method for treating breast cancer. The 
method comprises the step of administering to a patient diagnosed as having breast cancer, a 
therapeutically-effective amount of a compound which reduces in vivo the expression of a target 
breast cancer-associated protein thereby to reduce in vivo the expression of the target protein. In 
a preferred embodiment, the compound is a nucleobase containing sequence, for example, an 
anti-sense nucleic acid sequence or a peptidyl nucleic acid (PNA) capable of binding to and 
reducing the expression (for example, transcription or translation) of a nucleic acid encoding at 
least a portion of at least one of the breast cancer-associated proteins identified herein. After 
administration, the anti-sense nucleic acid sequence or the anti-sense PNA molecule binds to the 
nucleic acid sequences encoding, at least in part, the target protein thereby to reduce in vivo 
expression of the target breast cancer-associated protein. 

Thus, the invention provides a wide range of methods and compositions for detecting and 
treating breast cancer in an individual. Specifically, the invention provides breast cancer- 
associated proteins, which permit specific and early, preferably before metastases occur, 
detection of breast cancer in an individual. In addition, the invention provides kits useful in the 
detection of breast cancer in an individual. In addition, the invention provides methods utilizing 
the breast cancer-associated proteins as targets and indicators, for treating breast cancers and for 
monitoring of the efficacy of such a treatment. These and other numerous additional aspects and 
advantages of the invention will become apparent upon consideration of the following figures, 
detailed description, and claims which follow. 

Description of the Drawings 

The invention can be more completely understood with reference to the following 
drawings, in which: 

Figures 1A-1C are spectra resulting from the characterization via mass spectrometry of 28 
kD proteins subjected to trypsin digestion and eluted from a polyacrylamide gel. Figure 1A is a 
spectrum of the heaviest 28 kD protein isolated from the gel. Figure 1B is a spectrum of the 
median 28 kD protein isolated from the gel, and Figure 1C is a spectrum of the lightest 28 kD 
protein isolated from the gel. 



ATTY. DOCKET NO. MTP-024



Detailed Description of the Invention. 

The present invention provides methods and compositions for the detection and treatment 
of breast cancer. The invention is based, in part, upon the discovery of breast cancer-associated 
proteins which generally are present at detectably higher levels in serum of humans with breast 
cancer relative to serum of humans without breast cancer. 

The breast cancer-associated proteins or nucleic acids encoding such proteins may act as 
markers useful in the detection of breast cancer or as targets for therapy of breast cancer. For 
example, it is contemplated that the marker proteins and binding moieties, for example, 
antibodies that bind to the marker proteins or nucleic acid probes which hybridize to nucleic acid 
sequences encoding the marker proteins, may be used to detect the presence of breast cancer in 
an individual. Furthermore, it is contemplated that the skilled artisan may produce novel 
therapeutics for treating breast cancer which include, for example: antibodies which can be 
administered to an individual that bind to and reduce or eliminate the biological activity of the 
target protein in vivo; nucleic acid or peptidyl nucleic acid sequences which hybridize with genes 
or gene transcripts encoding the target proteins, thereby to reduce expression of the target 
proteins in vivo; or small molecules, for example, organic molecules which interact with the 
target proteins or other cellular moieties, for example, receptors for the target proteins, thereby to 
reduce or eliminate biological activity of the target proteins. 

Set forth below are methods for isolating breast cancer-associated proteins, methods for 
detecting breast cancer using breast cancer-associated proteins as markers, and methods for 
treating individuals afflicted with breast cancer using breast cancer-associated proteins as targets 
for cancer therapy. 

1. Methods for Detecting Breast Cancer-Associated Marker Proteins. 

Marker proteins of the invention, as disclosed herein, are identified by comparing the 
protein composition of serum of a human diagnosed with breast cancer with the protein 
composition of serum of a human free of breast cancer. As used herein, the term "breast cancer- 
associated protein" is understood to mean any protein which is detectable at a higher level in a 
tissue or body fluid of an individual diagnosed with breast cancer relative to a corresponding 
tissue or body fluid of an individual free of breast cancer and includes species and allelic variants 
thereof and fragments thereof. As used herein, the term "breast cancer" is understood to mean 
any cancer or cancerous lesion associated with breast tissue or breast tissue cells and can include 
precursors to breast cancer, for example, atypical ductal hyperplasia or non-atypical hyperplasia. 
It is not necessary that the marker protein or target molecule be unique to a breast cancer cell or 
body fluid of an individual afflicted with breast cancer; rather the marker protein or target 
molecule should have a signal to noise ratio high enough to discriminate between samples 
originating from a breast cancer tissue or body fluid and samples originating from normal breast 
tissue or body fluid. 

As used herein, a "portion" or a "fragment" of a protein or of an amino acid sequence 
denotes a contiguous peptide comprising, in sequence, at least ten amino acids from the protein 
or amino acid sequence (e.g., amino acids 1-10, 34-43, or 127-136 of the protein or sequence). 
Preferably, the peptide comprises, in sequence, at least twenty amino acids from the protein or 
amino acid sequence. More preferably, the peptide comprises, in sequence, at least forty amino 
acids from the protein or amino acid sequence. 
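
The definition of a "portion" or "fragment" can be illustrated with a short sketch. The helper names and the example sequence below are hypothetical; the only rule taken from the text is that a fragment is a contiguous run of at least ten amino acids (with twenty or forty as the preferred minimums).

```python
def fragment(protein: str, start: int, end: int) -> str:
    """Return residues start..end of the protein, 1-based and inclusive."""
    if not (1 <= start <= end <= len(protein)):
        raise ValueError("residue range lies outside the protein")
    return protein[start - 1:end]


def is_fragment(peptide: str, protein: str, min_length: int = 10) -> bool:
    """True if the peptide is a contiguous stretch of the protein that is
    at least min_length residues long."""
    return len(peptide) >= min_length and peptide in protein


# Made-up 140-residue sequence used purely for illustration.
PROTEIN = "ACDEFGHIKLMNPQRSTVWY" * 7

# Residues 34-43, one of the ranges given as an example in the text.
EXAMPLE = fragment(PROTEIN, 34, 43)
```

Passing `min_length=20` or `min_length=40` applies the preferred and more preferred minimum lengths, respectively.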

The breast cancer-associated marker proteins of the invention were identified by 
comparing the proteins present in the serum of individuals with breast cancer to the proteins 
present in the serum of individuals without breast cancer. Albumin and immunoglobulin 
proteins were removed from the serum, and the proteins were separated into twelve fractions by 
anion exchange chromatography. Briefly, the proteins were loaded on a strong anion exchange 
column in the presence of 50 mM sodium phosphate, pH 7.0, and eluted with a stepwise gradient 
of sodium chloride in 50 mM sodium phosphate, pH 7.0. The resulting twelve fractions include 
a flow-through fraction, a fraction eluting in 25 mM sodium chloride, a 50 mM fraction, a 75 
mM fraction, a 100 mM fraction, a 125 mM fraction, a 150 mM fraction, a 200 mM fraction, a 
250 mM fraction, a 300 mM fraction, a 400 mM fraction, and a 2 M fraction. 

Each fraction was analyzed by SELDI (surface-enhanced laser desorption and ionization) 
mass spectrometry. Samples from each of the twelve fractions were applied to one of four 
different SELDI chip surfaces. A copper or nickel SELDI surface can be generated by adding a 
copper or nickel salt solution to a chip comprising ethylenediaminetriacetic acid. Other SELDI 
chip surfaces include: WCX-2 which comprises carboxylate moieties, and SAX-2 which 
comprises quaternary ammonium moieties. The breast cancer-associated proteins of the 
invention can therefore be characterized by their increased presence in serum of individuals 
having breast cancer relative to individuals without breast cancer, their molecular weight, 
binding and elution characteristics on an anion exchange resin, and their affinity to a particular 
SELDI chip. For example, as used herein, the term "affinity" to a particular SELDI chip is 
understood to mean that the breast cancer-associated proteins of the invention bind preferentially 
to one type of SELDI chip (e.g., copper SELDI chip) relative to one or more of the other SELDI 
chips (e.g., the nickel, SAX-2 and WCX-2 chips) disclosed herein. As discussed in detail in 
Example 1, comparison of the sera from diseased and healthy individuals revealed a number of 
proteins frequently present at detectable levels in the sera of diseased individuals, but 
infrequently present at comparable levels in the sera of healthy individuals. 
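
Because each marker is characterized by its approximate molecular weight, its anion exchange elution fraction, and its SELDI chip affinity, the markers summarized above can be collected into a simple lookup structure. The following is a sketch only: the class, the function, and the mass tolerance of 1 kD are assumptions introduced for illustration (the text says only "about" for each molecular weight), and `None` is used here to denote the flow-through fraction.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Marker:
    mw_kd: float                    # approximate molecular weight in kD
    elution_mm_nacl: Optional[int]  # NaCl step in mM; None = flow-through
    chip: str                       # SELDI chip surface


# Values transcribed from the Summary of the Invention.
MARKERS = [
    Marker(16, None, "nickel"),
    Marker(17, 25, "WCX-2"),
    Marker(30, 25, "WCX-2"),
    Marker(35, 25, "WCX-2"),
    Marker(20, 50, "nickel"),
    Marker(24, 50, "nickel"),
    Marker(28, 50, "nickel"),
    Marker(35, 50, "nickel"),
    Marker(18, 100, "WCX-2"),
    Marker(71, 100, "WCX-2"),
    Marker(12, 150, "SAX-2"),
    Marker(42, 200, "nickel"),
    Marker(56, 200, "nickel"),
    Marker(35, 400, "copper"),
]


def match_markers(observed_kd: float, elution_mm_nacl: Optional[int],
                  chip: str, tolerance_kd: float = 1.0) -> List[Marker]:
    """Return markers consistent with an observed peak.

    tolerance_kd is a hypothetical mass window, not a value from the source.
    """
    return [m for m in MARKERS
            if abs(m.mw_kd - observed_kd) <= tolerance_kd
            and m.elution_mm_nacl == elution_mm_nacl
            and m.chip == chip]
```

The three markers of about 35 kD show why all three characteristics are needed: they share a molecular weight and are distinguished only by elution fraction and chip surface.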

Once the breast cancer-associated proteins have been identified by mass spectroscopy, the 
identified proteins can be isolated by standard protein isolation methodologies and sequenced 
using protein sequencing technologies known and used in the art. See, for example, Examples 5 
and 6. Once the amino acid sequences are identified then nucleic acids encoding the marker 
proteins or portions thereof can be identified using conventional recombinant DNA 
methodologies. See, for example, Sambrook et al eds. (1989) "Molecular Cloning: A 
Laboratory Manual", Cold Spring Harbor Press. For example, an isolated breast cancer-associated 
protein can be sequenced using conventional peptide sequencing protocols, and oligonucleotide 
hybridization probes can then be designed for screening a cDNA library. The cDNA 
library may then be screened with the resultant hybridization probes to isolate full length or 
partial length cDNA sequences encoding the isolated marker proteins. 

Marker proteins useful in the present invention encompass not only the particular 
sequences identified herein but also allelic variants thereof and related proteins that also function 
as marker proteins. Thus, for example, sequences that result from alternative splice forms, post- 
translational modification, or gene duplication are each encompassed by the present invention. 
Species variants are also encompassed by this invention where the patient is a non-human 
mammal. Other homologous proteins that may function as marker proteins are also envisioned. 
Preferably, variant sequences are at least 80% similar or 70% identical, more preferably at least 
90% similar or 80% identical, and most preferably 95% similar or 90% identical to at least a 
portion of one of the sequences disclosed herein. 

To determine whether a candidate peptide region has the requisite percentage similarity or 
identity to a reference polypeptide or peptide oligomer, the candidate amino acid sequence and 
the reference amino acid sequence are first aligned using the dynamic programming algorithm 
described in Smith and Waterman (1981), J. Mol. Biol. 147:195-197, in combination with the 
BLOSUM62 substitution matrix described in Figure 2 of Henikoff and Henikoff (1992), "Amino 
acid substitution matrices from protein blocks", PNAS (1992 Nov), 89:10915-10919. For the 
present invention, an appropriate value for the gap insertion penalty is -12, and an appropriate 
value for the gap extension penalty is -4. Computer programs performing alignments using the 
algorithm of Smith-Waterman and the BLOSUM62 matrix, such as the GCG program suite 
(Oxford Molecular Group, Oxford, England), are commercially available and widely used by 
those skilled in the art. 
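
For orientation, the local alignment step can be sketched as a small dynamic programming routine with separate gap insertion and gap extension penalties. This is not the GCG implementation. It assumes one common convention, in which the first residue of a gap costs the insertion penalty (-12) and each additional residue costs the extension penalty (-4). It returns only the best local score, without the traceback needed to recover the alignment itself, and the toy scoring function stands in for a BLOSUM62 lookup.

```python
def smith_waterman_score(a, b, score, gap_open=-12, gap_extend=-4):
    """Best local alignment score of sequences a and b with affine gaps."""
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # best[i][j]: best local score of an alignment ending at a[i-1], b[j-1]
    best = [[0] * (m + 1) for _ in range(n + 1)]
    # gap_a / gap_b: best score ending with a gap placed in a / in b
    gap_a = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    gap_b = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    top = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Open a new gap or extend an existing one.
            gap_a[i][j] = max(best[i][j - 1] + gap_open,
                              gap_a[i][j - 1] + gap_extend)
            gap_b[i][j] = max(best[i - 1][j] + gap_open,
                              gap_b[i - 1][j] + gap_extend)
            # Local alignment: a score never drops below zero.
            best[i][j] = max(0,
                             best[i - 1][j - 1] + score(a[i - 1], b[j - 1]),
                             gap_a[i][j],
                             gap_b[i][j])
            top = max(top, best[i][j])
    return top


def toy_score(x, y):
    """Hypothetical stand-in for a substitution matrix lookup."""
    return 5 if x == y else -4
```

With the toy scoring function, aligning `"ACDEFGKHILMNP"` against `"ACDEFGHILMNP"` gives twelve matches and one single-residue gap, for a score of 12 × 5 − 12 = 48.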

Once the alignment between the candidate and reference sequence is made, a percent 
similarity score may be calculated. The individual amino acids of each sequence are compared 
sequentially according to their similarity to each other. If the value in the BLOSUM62 matrix 
corresponding to the two aligned amino acids is zero or a negative number, the pairwise 
similarity score is zero; otherwise the pairwise similarity score is 1.0. The raw similarity score is 
the sum of the pairwise similarity scores of the aligned amino acids. The raw score is then 
normalized by dividing it by the number of amino acids in the smaller of the candidate or 
reference sequences. The normalized raw score is the percent similarity. Alternatively, to 
calculate a percent identity, the aligned amino acids of each sequence are again compared 
sequentially. If the amino acids are non-identical, the pairwise identity score is zero; otherwise 
the pairwise identity score is 1.0. The raw identity score is the sum of the identical aligned 
amino acids. The raw score is then normalized by dividing it by the number of amino acids in 
the smaller of the candidate or reference sequences. The normalized raw score is the percent 
identity. Insertions and deletions are ignored for the purposes of calculating percent similarity 
and identity. Accordingly, gap penalties are not used in this calculation, although they are used 
in the initial alignment. 
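
By way of illustration only, the percent similarity and percent identity calculations described above may be sketched as follows. The sketch assumes the alignment is supplied as two equal-length strings in which "-" marks an insertion or deletion. The function names and the example sequences are hypothetical, and only the handful of BLOSUM62 entries needed for the example are included; a complete matrix would be used in practice.

```python
# Excerpt of the (symmetric) BLOSUM62 matrix covering only the pairs used below.
BLOSUM62_EXCERPT = {
    ("K", "R"): 2, ("I", "L"): 2, ("A", "G"): 0, ("V", "V"): 4, ("D", "E"): 2,
}


def blosum62(a, b):
    """Look up a substitution score; keys are stored in sorted order."""
    return BLOSUM62_EXCERPT[tuple(sorted((a, b)))]


def _aligned_pairs(cand, ref):
    """Residue pairs at aligned positions; gapped columns are ignored."""
    if len(cand) != len(ref):
        raise ValueError("aligned sequences must have equal length")
    return [(c, r) for c, r in zip(cand, ref) if c != "-" and r != "-"]


def _shorter_length(cand, ref):
    """Number of amino acids in the smaller of the two sequences."""
    return min(len(cand.replace("-", "")), len(ref.replace("-", "")))


def percent_similarity(cand, ref, matrix=blosum62):
    # Pairwise score is 1 when the matrix value is positive, otherwise 0.
    raw = sum(1 for c, r in _aligned_pairs(cand, ref) if matrix(c, r) > 0)
    return 100.0 * raw / _shorter_length(cand, ref)


def percent_identity(cand, ref):
    # Pairwise score is 1 when the residues are identical, otherwise 0.
    raw = sum(1 for c, r in _aligned_pairs(cand, ref) if c == r)
    return 100.0 * raw / _shorter_length(cand, ref)


candidate, reference = "KLAV-D", "RIGVWE"
print(percent_similarity(candidate, reference))  # 4 of 5 residues similar: 80.0
print(percent_identity(candidate, reference))    # 1 of 5 residues identical: 20.0
```

In the example, the A/G pair scores zero in BLOSUM62 and therefore does not count as similar, and the gapped column contributes nothing to either score.
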

In all instances, variants of the naturally-occurring sequences, as described above, must 
be tested for their function as marker proteins. Specifically, their presence or absence in a 
particular form or in a particular biological compartment must be indicative of the presence or 
absence of cancer in an individual. This routine experimentation can be carried out by the 
methods described hereinbelow or by other methods known in the art. 

Marker proteins in a sample of tissue or body fluid may be detected via binding assays, 
wherein a binding partner for the marker protein is introduced into a sample suspected of 
containing the marker protein. In such an assay, the binding partner may be detectably labeled 
as, for example, with a radioisotopic or fluorescent marker. Labeled antibodies may be used in a 
similar manner in order to isolate selected marker proteins. Nucleic acids encoding marker 
proteins may be detected using nucleic acid probes having a sequence complementary to at least 
a portion of the sequence encoding the marker protein. Techniques such as PCR and, in 
particular, reverse transcriptase PCR, are useful means for isolating nucleic acids encoding a 
marker protein. The examples which follow provide details of the isolation and characterization 
of breast cancer-associated proteins and methods for their use in the detection and treatment of 
breast cancer. 

2. Detection of Breast Cancer 

Once breast cancer-associated proteins have been identified, the proteins or nucleic acids 
encoding the proteins may be used as markers to determine whether an individual has breast 
cancer and, if so, suitable detection methods can be used to monitor the status of the disease. 

Using the marker proteins or nucleic acids encoding the proteins, the skilled artisan can 
produce a variety of detection methods for detecting breast cancer in a human. The methods 
typically comprise the steps of detecting, by some means, the presence of one or more breast 
cancer-associated proteins or nucleic acids encoding such proteins in a tissue or body fluid 
sample of the human. The accuracy and/or reliability of the method for detecting breast cancer in 
a human may be further enhanced by detecting the presence of a plurality of breast cancer- 
associated proteins and/or nucleic acids in a preselected tissue or body fluid sample. The 
detection assays may comprise one or more of the protocols described hereinbelow. 

2.A. Protein-Based Assays 

The marker protein in a sample may be detected, for example, by combining the marker 
protein with a binding moiety capable of specifically binding the marker protein. The binding 
moiety may comprise, for example, a member of a ligand-receptor pair, i.e., a pair of molecules 
capable of having a specific binding interaction. The binding moiety may comprise, for example, 
a member of a specific binding pair, such as antibody-antigen, enzyme-substrate, nucleic acid- 
nucleic acid, protein-nucleic acid, protein-protein, or other specific binding pair known in the art. 
Binding proteins may be designed which have enhanced affinity for a target protein. Optionally, 
the binding moiety may be linked with a detectable label, such as an enzymatic, fluorescent, 
radioactive, phosphorescent or colored particle label. The labeled complex may be detected, e.g., 
visually or with the aid of a spectrophotometer or other detector. 

Marker proteins may also be detected using gel electrophoresis techniques available in the 
art. In two-dimensional gel electrophoresis, the proteins are separated first in a pH gradient gel 
according to their isoelectric point. The resulting gel then is placed on a second polyacrylamide 
gel, and the proteins separated according to molecular weight (see, for example, O'Farrell (1975) 
J. Biol. Chem. 250: 4007-4021). 

One or more marker proteins may be detected by first isolating proteins from a sample 
obtained from an individual suspected of having breast cancer, and then separating the proteins 
by two-dimensional gel electrophoresis to produce a characteristic two-dimensional gel 
electrophoresis pattern. The pattern may then be compared with a standard gel pattern produced 
by separating, under the same or similar conditions, proteins isolated from normal or cancer cells. 
The standard gel pattern may be stored in, and retrieved from, an electronic database of 
electrophoresis patterns. The presence of a breast cancer-associated protein in the two- 
dimensional gel provides an indication that the sample being tested was taken from a person with 
breast cancer. As with the other detection assays described herein, the detection of two or more 
proteins, for example, in the two-dimensional gel electrophoresis pattern further enhances the 
accuracy of the assay. The presence of a plurality, e.g., two to five, breast cancer-associated 
proteins on the two-dimensional gel provides an even stronger indication of the presence of a 
breast cancer in the individual. The assay thus permits the early detection and treatment of breast 
cancer. 

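
By way of illustration only, the comparison of a sample gel pattern with a stored standard pattern may be sketched as follows. The sketch assumes each spot is recorded by its isoelectric point and molecular weight; the `Spot` type, the tolerances, and all spot coordinates are hypothetical values chosen for the example and are not taken from any actual gel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spot:
    name: str
    pI: float      # isoelectric point (first dimension)
    mw_kda: float  # molecular weight in kD (second dimension)


def matches(sample, marker, pi_tol=0.1, mw_rel_tol=0.05):
    """True if a sample spot lies within tolerance of a marker spot.

    Tolerances are illustrative assumptions, not validated values.
    """
    return (abs(sample.pI - marker.pI) <= pi_tol
            and abs(sample.mw_kda - marker.mw_kda) <= mw_rel_tol * marker.mw_kda)


def markers_detected(sample_spots, marker_spots, **tolerances):
    """Names of marker spots from the standard pattern found in the sample."""
    return [m.name for m in marker_spots
            if any(matches(s, m, **tolerances) for s in sample_spots)]


# Hypothetical standard pattern retrieved from a database of gel patterns.
standard = [Spot("marker-1", 5.6, 42.0),
            Spot("marker-2", 7.1, 88.0),
            Spot("marker-3", 6.3, 25.0)]
# Hypothetical spots observed on the sample gel.
sample = [Spot("s1", 5.65, 41.0), Spot("s2", 7.4, 88.0), Spot("s3", 6.3, 25.5)]

found = markers_detected(sample, standard)
print(found, "plurality detected:", len(found) >= 2)
```

Counting the number of matched markers reflects the observation above that detecting two or more proteins strengthens the indication.
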
A breast cancer-associated marker protein may also be detected using any of a wide range of 
immunoassay techniques available in the art. For example, the skilled artisan may employ the 
sandwich immunoassay format to detect breast cancer in a body fluid sample. Alternatively, the 
skilled artisan may use conventional immuno-histochemical procedures for detecting the 
presence of the breast cancer-associated protein in a tissue sample using one or more labeled 
binding proteins. 

In a sandwich immunoassay, two antibodies capable of binding the marker protein generally 
are used, e.g., one immobilized onto a solid support, and one free in solution and labeled with a 
detectable chemical compound. Examples of chemical labels that may be used for the second 
antibody include radioisotopes, fluorescent compounds, and enzymes or other molecules that 
generate colored or electrochemically active products when exposed to a reactant or enzyme 
substrate. When a sample containing the marker protein is placed in this system, the marker 
protein binds to both the immobilized antibody and the labeled antibody, to form a "sandwich" 
immune complex on the support's surface. The complexed protein is detected by washing away 
non-bound sample components and excess labeled antibody, and measuring the amount of 
labeled antibody complexed to protein on the support's surface. Alternatively, the antibody free 
in solution, which can be labeled with a chemical moiety, for example, a hapten, may be detected 
by a third antibody labeled with a detectable moiety which binds the free antibody or, for 
example, the hapten coupled thereto. 

Both the sandwich immunoassay and tissue immunohistochemical procedures are highly 
specific and very sensitive, provided that labels with good limits of detection are used. A 
detailed review of immunological assay design, theory and protocols can be found in numerous 
texts in the art, including "Practical Immunology", Butt, W.R., ed., (1984) Marcel Dekker, New 
York and "Antibodies, A Laboratory Approach", Harlow et al. eds. (1988) Cold Spring Harbor 
Laboratory. 

In general, immunoassay design considerations include preparation of antibodies (e.g., 
monoclonal or polyclonal antibodies) having sufficiently high binding specificity for the target 
protein to form a complex that can be distinguished reliably from products of nonspecific 
interactions. As used herein, the term "antibody" is understood to mean binding proteins, for 
example, antibodies or other proteins comprising an immunoglobulin variable region-like 
binding domain, having the appropriate binding affinities and specificities for the target protein. 
The higher the antibody binding specificity, the lower the target protein concentration that can be 
detected. As used herein, the terms "specific binding" or "binding specifically" are understood to 
mean that the binding moiety, for example, a binding protein has a binding affinity for the target 
protein of greater than about 10^ M^-1, more preferably greater than about 10^ M^-1. 

Antibodies to an isolated target breast cancer-associated protein which are useful in assays 
for detecting a breast cancer in an individual may be generated using standard immunological 
procedures well known and described in the art. See, for example, Practical Immunology, Butt, 
W.R., ed., Marcel Dekker, NY, 1984. Briefly, an isolated target protein is used to raise antibodies 
in a xenogeneic host, such as a mouse, goat or other suitable mammal. The marker protein is 
combined with a suitable adjuvant capable of enhancing antibody production in the host, and is 
injected into the host, for example, by intraperitoneal administration. Any adjuvant suitable for 
stimulating the host's immune response may be used. A commonly used adjuvant is Freund's 
complete adjuvant (an emulsion comprising killed and dried microbial cells and available from, 
for example, Calbiochem Corp., San Diego, or Gibco, Grand Island, NY). Where multiple 
antigen injections are desired, the subsequent injections may comprise the antigen in combination 
with an incomplete adjuvant (e.g., cell-free emulsion). Polyclonal antibodies may be isolated 
from the antibody-producing host by extracting serum containing antibodies to the protein of 
interest. Monoclonal antibodies may be produced by isolating host cells that produce the desired 
antibody, fusing these cells with myeloma cells using standard procedures known in the 
immunology art, and screening for hybrid cells (hybridomas) that react specifically with the 
target protein and have the desired binding affinity. 

Antibody binding domains also may be produced biosynthetically and the amino acid 
sequence of the binding domain manipulated to enhance binding affinity with a preferred epitope 
on the target protein. Specific antibody methodologies are well understood and described in the 
literature. A more detailed description of their preparation can be found, for example, in 
"Practical Immunology" (1984) (supra). 

In addition, genetically engineered biosynthetic antibody binding sites, also known in the art 
as BABS or sFv's, may be used in the practice of the instant invention. Methods for making and 
using BABS comprising (i) non-covalently associated or disulfide bonded synthetic VH and VL 
dimers, (ii) covalently linked VH-VL single chain binding sites, (iii) individual VH or VL 
domains, or (iv) single chain antibody binding sites are disclosed, for example, in U.S. Patent 
Nos. 5,091,513; 5,132,405; 4,704,692; and 4,946,778. Furthermore, BABS having requisite 
specificity for the breast cancer-associated proteins can be derived by phage antibody cloning 
from combinatorial gene libraries (see, for example, Clackson et al (1991) Nature 352: 624- 
628). Briefly, phage each expressing on their coat surfaces BABS having immunoglobulin 
variable regions encoded by variable region gene sequences derived from mice pre-immunized 
with isolated breast cancer-associated proteins, or fragments thereof, are screened for binding 
activity against immobilized breast cancer-associated protein. Phage which bind to the 
immobilized breast cancer-associated proteins are harvested and the gene encoding the BABS is 
sequenced. The resulting nucleic acid sequences encoding the BABS of interest then may be 
expressed in conventional expression systems to produce the BABS protein. 

The isolated breast cancer-associated protein also may be used for the development of 
diagnostic and other tissue evaluating kits and assays to monitor the level of the proteins in a 
tissue or fluid sample. For example, the kit may include antibodies or other specific binding 
proteins which bind specifically to the breast cancer-associated proteins and which permit the 
presence and/or concentration of the breast cancer-associated proteins to be detected and/or 
quantitated in a tissue or fluid sample. 

Suitable kits for detecting breast cancer-associated proteins are contemplated to include, e.g., 
a receptacle or other means for capturing a sample to be evaluated, and means for detecting the 
presence and/or quantity in the sample of one or more of the breast cancer-associated proteins 
described herein. As used herein, "means for detecting" in one embodiment includes one or 
more antibodies specific for these proteins and means for detecting the binding of the antibodies 
to these proteins by, e.g., a standard sandwich immunoassay as described herein. Where the 
presence of a protein within a cell is to be detected, e.g., as from a tissue sample, the kit also may 
comprise means for disrupting the cell structure so as to expose intracellular proteins. 

The presence of a breast cancer in an individual also may be determined by detecting, in a 
tissue or body fluid sample, a nucleic acid molecule encoding a breast cancer-associated protein. 
Using methods well known to those of ordinary skill in the art, the breast cancer-associated 
proteins of the invention may be sequenced, and then, based on the determined sequence, 
oligonucleotide probes designed for screening a cDNA library (see, for example, Sambrook et al. 
(1989) supra). 

A target nucleic acid molecule encoding a marker breast cancer-associated protein may be 
detected using a labeled binding moiety capable of specifically binding the target nucleic acid. 
The binding moiety may comprise, for example, a protein, a nucleic acid or a peptide nucleic 
acid. Additionally, a target nucleic acid, such as an mRNA encoding a breast cancer-associated 
protein, may be detected by conducting, for example, a Northern blot analysis using labeled 
oligonucleotides, e.g., nucleic acid fragments complementary to and capable of hybridizing 
specifically with at least a portion of a target nucleic acid. 

More specifically, gene probes comprising complementary RNA or, preferably, DNA to the 
breast cancer-associated nucleotide sequences or mRNA sequences encoding breast cancer- 
associated proteins may be produced using established recombinant techniques or 
oligonucleotide synthesis. The probes hybridize with complementary nucleic acid sequences 
presented in the test specimen, and can provide exquisite specificity. A short, well-defined 
probe, coding for a single unique sequence is most precise and preferred. Larger probes are 
generally less specific. While an oligonucleotide of any length may hybridize to an mRNA 
transcript, oligonucleotides typically within the range of 8-100 nucleotides, preferably within the 
range of 15-50 nucleotides, are envisioned to be most useful in standard hybridization assays. 
Choices of probe length and sequence allow one to choose the degree of specificity desired. 
Hybridization is carried out at from 50° to 65°C in a high salt buffer solution, with formamide or 
other agents used to set the degree of complementarity required. Furthermore, the state of the art 
is such that probes can be manufactured to recognize essentially any DNA or RNA sequence. For 
additional particulars, see, for example, Guide to Molecular Techniques, Berger et al., Methods 
of Enzymology, Vol. 152, 1987. 
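
By way of illustration only, the design of a DNA probe complementary to a portion of a target mRNA may be sketched as follows. The function name and the example transcript are hypothetical; the default length limits reflect the preferred range of 15-50 nucleotides noted above, and no check of sequence uniqueness or hybridization behavior is attempted.

```python
# Base-pairing rules for building a DNA strand complementary to RNA or DNA.
_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def probe_for_target(target, start, length, min_len=15, max_len=50):
    """Return a DNA probe, written 5'->3', complementary to
    target[start:start + length]."""
    if not min_len <= length <= max_len:
        raise ValueError(f"probe length must be {min_len}-{max_len} nucleotides")
    region = target[start:start + length].upper()
    if len(region) != length:
        raise ValueError("requested region extends past the end of the target")
    # Reverse the region so that the complementary strand reads 5'->3'.
    return "".join(_DNA_COMPLEMENT[base] for base in reversed(region))


# Hypothetical 30-nucleotide transcript used only for demonstration.
mrna = "AUGGCCAAGCUGACCUUCGAGGUCAACUGA"
print(probe_for_target(mrna, 0, 15))  # GGTCAGCTTGGCCAT
```

Shortening or lengthening the requested region is one simple way of trading off the specificity discussed above.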

A wide variety of different labels coupled to the probes or antibodies may be employed in 
the assays. The labeled reagents may be provided in solution or coupled to an insoluble support, 
depending on the design of the assay. The various conjugates may be joined covalently or 
noncovalently, directly or indirectly. When bonded covalently, the particular linkage group will 
depend upon the nature of the two moieties to be bonded. A large number of linking groups and 
methods for linking are taught in the literature. Broadly, the labels may be divided into the 
following categories: chromogens; catalyzed reactions; chemiluminescence; radioactive labels; 
and colloidal-sized colored particles. The chromogens include compounds which absorb light in 
a distinctive range so that a color may be observed, or emit light when irradiated with light of a 
particular wavelength or wavelength range, e.g., fluorescers. Both enzymatic and nonenzymatic 
catalysts may be employed. In choosing an enzyme, there will be many considerations including 
the stability of the enzyme, whether it is normally present in samples of the type for which the 
assay is designed, the nature of the substrate, and the effect if any of conjugation on the enzyme's 
properties. Potentially useful enzyme labels include oxidoreductases, transferases, hydrolases, 
lyases, isomerases, ligases, or synthetases. Interrelated enzyme systems may also be used. A 
chemiluminescent label involves a compound that becomes electronically excited by a chemical 
reaction and may then emit light that serves as a detectable signal or donates energy to a 
fluorescent acceptor. Radioactive labels include various radioisotopes found in common use 
such as the unstable forms of hydrogen, iodine, phosphorus or the like. Colloidal-sized colored 
particles involve material such as colloidal gold that, in aggregate, form a visually detectable 
distinctive spot corresponding to the site of a substance to be detected. Additional information 
on labeling technology is disclosed, for example, in U.S. Pat. No. 4,366,241. 

A common method of in vitro labeling of nucleotide probes involves nick translation 
wherein the unlabeled DNA probe is nicked with an endonuclease to produce free 3' hydroxyl 
termini within either strand of the double-stranded fragment. Simultaneously, an exonuclease 
removes the nucleotide residue from the 5' phosphoryl side of the nick. The sequence of 
replacement nucleotides is determined by the sequence of the opposite strand of the duplex. 
Thus, if labeled nucleotides are supplied, DNA polymerase will fill in the nick with the labeled 
nucleotides. Using this well-known technique, up to 50% of the molecule can be labeled. For 
smaller probes, known methods involving 3' end labeling may be used. Furthermore, there are 
currently commercially available methods of labeling DNA with fluorescent molecules, catalysts, 
enzymes, or chemiluminescent materials. Biotin labeling kits are commercially available (Enzo 
Biochem Inc.) under the trademark Bio-Probe. This type of system permits the probe to be 
coupled to avidin which in turn is labeled with, for example, a fluorescent molecule, enzyme, 
antibody, etc. For further disclosure regarding probe construction and technology, see, for 
example, Sambrook et al., Molecular Cloning, A Laboratory Manual (Cold Spring Harbor, N.Y., 
1982). 

The oligonucleotide selected for hybridizing to the target nucleic acid, whether synthesized 
chemically or by recombinant DNA methodologies, is isolated and purified using standard 
techniques and then preferably labeled (e.g., with 35S or 32P) using standard labeling protocols. 
A sample containing the target nucleic acid then is run on an electrophoresis gel, the dispersed 
nucleic acids transferred to a nitrocellulose filter and the labeled oligonucleotide exposed to the 
filter under stringent hybridizing conditions, e.g., 50% formamide, 5 X SSPE, 2 X Denhardt's 
solution, 0.1% SDS at 42°C, as described in Sambrook et al. (1989) supra. The filter may then 
be washed using 2 X SSPE, 0.1% SDS at 68°C, and more preferably using 0.1 X SSPE, 0.1% 
SDS at 68°C. Other useful procedures known in the art include solution hybridization, and dot 
and slot RNA hybridization. Optionally, the amount of the target nucleic acid present in a 
sample is then quantitated by measuring the radioactivity of hybridized fragments, using standard 
procedures known in the art. 
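
By way of illustration only, the volumes of stock solutions needed to prepare the hybridization solution described above (50% formamide, 5 X SSPE, 2 X Denhardt's solution, 0.1% SDS) may be calculated as sketched below. The stock concentrations (100% formamide, 20 X SSPE, 50 X Denhardt's solution, 10% SDS) are assumptions reflecting commonly used laboratory stocks and are not specified herein; the function name is hypothetical.

```python
def stock_volumes(final_volume_ml, targets, stocks):
    """Volume of each stock (ml) needed to reach its target concentration,
    using C1 * V1 = C2 * V2, with water making up the remainder."""
    volumes = {name: final_volume_ml * targets[name] / stocks[name]
               for name in targets}
    water = final_volume_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute to reach the target concentrations")
    volumes["water"] = water
    return volumes


# Target concentrations from the hybridization conditions described above.
targets = {"formamide (%)": 50, "SSPE (X)": 5, "Denhardt's (X)": 2, "SDS (%)": 0.1}
# Assumed stock concentrations (not stated in the text).
stocks = {"formamide (%)": 100, "SSPE (X)": 20, "Denhardt's (X)": 50, "SDS (%)": 10}

for name, ml in stock_volumes(10, targets, stocks).items():
    print(f"{name}: {ml:.2f} ml")
```

The same function may be applied to the wash solutions by changing the target concentrations.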

In addition, oligonucleotides also may be used to identify other sequences encoding 
members of the target protein families. The methodology also may be used to identify genetic 
sequences associated with the nucleic acid sequences encoding the proteins described herein, 
e.g., to identify non-coding sequences lying upstream or downstream of the protein coding 
sequence, and which may play a functional role in expression of these genes. Additionally, 
binding assays may be conducted to identify and detect proteins capable of a specific binding 
interaction with a nucleic acid encoding a breast cancer-associated protein, which may be 
involved, e.g., in gene regulation or gene expression of the protein. In a further embodiment, the 
assays described herein may be used to identify and detect nucleic acid molecules comprising a 
sequence capable of recognizing and being specifically bound by a breast cancer-associated 
protein. 

In addition, it is anticipated that using a combination of appropriate oligonucleotide primers, 
i.e., more than one primer, the skilled artisan may determine the level of expression of a target 
gene in vivo by standard polymerase chain reaction (PCR) procedures, for example, by 
quantitative PCR. Conventional PCR based assays are discussed, for example, in Innes et al. 
(1990) "PCR Protocols: A Guide to Methods and Applications", Academic Press and Innes et al. 
(1995) "PCR Strategies", Academic Press, San Diego, CA. 

3. Identification of Proteins Which Interact In Vivo With Breast Cancer-associated Proteins 

In addition, it is contemplated that the skilled artisan, using procedures like those 
described hereinbelow, may identify other molecules which interact in vivo with the breast 
cancer-associated proteins described herein. Such molecules also may provide possible targets 
for chemotherapy. 

By way of example, cDNA encoding proteins or peptides capable of interacting with 
breast cancer-associated proteins can be determined using a two-hybrid assay, as reported in 
Durfee et al (1993) Genes & Develop, 7: 555-559. The principle of the two hybrid system is that 
noncovalent interaction of two proteins triggers a process (transcription) in which these proteins 
normally play no direct role, because of their covalent linkage to domains that function in this 
process. For example, in the two-hybrid assay, detectable expression of a reporter gene occurs 
when two fusion proteins, one comprising a DNA-binding domain and one comprising a 
transcription initiation domain, interact. 

The skilled artisan can use a host cell that contains one or more reporter genes, such as 
yeast strain Y153, reported in Durfee et al (1993) supra. This strain carries two chromosomally 
located reporter genes whose expression is regulated by Gal4. A first reporter gene is the E. coli 
lacZ gene under the control of the Gal4 promoter. A second reporter gene is the selectable HIS3 
gene. Other useful reporter genes may include, for example, the luciferase gene, the LEU2 gene, 
and the GFP (Green Fluorescent Protein) gene. 

Two sets of plasmids are used in the two hybrid system. One set of plasmids contains 
DNA encoding a Gal4 DNA-binding domain fused in frame to DNA encoding a breast cancer- 
associated protein. The other set of plasmids contains DNA encoding a Gal4 activation domain 
fused to portions of a human cDNA library constructed from human lymphocytes. Expression 
from the first set of plasmids results in a fusion protein comprising a Gal4 DNA-binding domain 
and a breast cancer-associated protein. Expression from the second set of plasmids produces a 
transcription activation protein fused to an expression product from the lymphocyte cDNA 
library. When the two plasmids are transformed into a Gal4-deficient host cell, such as the yeast 
Y153 cells described above, interaction of the Gal4 DNA binding domain and transcription 
activation domain occurs only if the breast cancer-associated protein fused to the DNA binding 
domain binds to a protein expressed from the lymphocyte cDNA library fused to the transcription 
activating domain. As a result of the protein-protein interaction between the breast cancer- 
associated protein and its in vivo binding partner, detectable levels of reporter gene expression 
occur. 
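
By way of illustration only, the logic of the two-hybrid screen described above may be modeled as follows. This is a deliberately simplified sketch: the bait and library clone names are hypothetical, and the set of interacting pairs stands in for what is actually unknown and is revealed experimentally by reporter gene expression.

```python
def reporter_expressed(bait, prey, interactions):
    """The reporter is expressed only when the bait (fused to the DNA-binding
    domain) binds the prey (fused to the activation domain)."""
    return frozenset((bait, prey)) in interactions


def screen_library(bait, library, interactions):
    """Return the library clones whose products activate the reporter."""
    return [prey for prey in library
            if reporter_expressed(bait, prey, interactions)]


# Hypothetical names used only for demonstration.
bait = "BCA-protein"
library = ["clone-1", "clone-2", "clone-3"]
interactions = {frozenset(("BCA-protein", "clone-2"))}

print(screen_library(bait, library, interactions))  # ['clone-2']
```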

In addition to identifying molecules which interact in vivo with the breast cancer- 
associated proteins, the skilled artisan may also screen for molecules, for example, small 
molecules which alter or inhibit specific interaction between a breast cancer-associated protein 
and its in vivo binding partner. 

For example, a host cell can be transfected with DNA encoding a suitable DNA binding 
domain/breast cancer-associated protein hybrid and a transcription activation domain/putative 
breast cancer-associated protein binding partner, as disclosed above. The host cell also contains 
a suitable reporter gene in operative association with a cis-acting transcription activation element 
that is recognized by the transcription factor DNA binding domain. The level of reporter gene 
expression in the system is assayed. Then, the host cell is exposed to a candidate molecule and the 
level of reporter gene expression is detected. A reduction in reporter gene expression is 
indicative of the candidate's ability to interfere with complex formation or stability with respect 
to the breast cancer-associated protein and its in vivo binding partner. As a control, the candidate 
molecule's ability to interfere with other, unrelated protein-protein complexes is also tested. 
Molecules capable of specifically interfering with a breast cancer-associated protein/binding 
partner interaction, but not other protein-protein interactions, are identified as candidates for 
production and further analysis. Once a potential candidate has been identified, its efficacy in 
modulating cell cycling and cell replication can be assayed in a standard cell cycle model system. 
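
By way of illustration only, the selection of candidate molecules from reporter gene measurements may be sketched as follows. The function names and the threshold values (at least 50% reduction for the target complex, at most 10% for the unrelated control complex) are hypothetical assumptions chosen for the example; appropriate cut-offs would be determined empirically.

```python
def percent_reduction(before, after):
    """Percent decrease in reporter gene expression after exposure."""
    if before <= 0:
        raise ValueError("baseline reporter level must be positive")
    return 100.0 * (before - after) / before


def is_specific_inhibitor(target_before, target_after,
                          control_before, control_after,
                          min_target_reduction=50.0,
                          max_control_reduction=10.0):
    """True if the candidate reduces reporter expression for the breast
    cancer-associated protein/binding partner pair but not for an
    unrelated protein-protein pair used as a control."""
    return (percent_reduction(target_before, target_after) >= min_target_reduction
            and percent_reduction(control_before, control_after) <= max_control_reduction)


# Hypothetical reporter readings in arbitrary units.
print(is_specific_inhibitor(100, 30, 100, 95))  # reduces target only
print(is_specific_inhibitor(100, 30, 100, 40))  # also disrupts the control
```
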

Candidate molecules can be produced as described hereinbelow. For example, DNA 
encoding the candidate molecules can be inserted, using conventional techniques well described 
in the art (see, for example, Sambrook (1989) supra) into any of a variety of expression vectors 
and transfected into an appropriate host cell to produce recombinant proteins, including both full 
length and truncated forms. Useful host cells include E. coli, Saccharomyces cerevisiae, Pichia 
pastoris, the insect/baculovirus cell system, myeloma cells, and various other mammalian cells. 
The full length forms of such proteins are preferably expressed in mammalian cells, as disclosed 
herein. The nucleotide sequences also preferably include a sequence for targeting the translated 
sequence to the nucleus, using, for example, a sequence encoding the eight amino acid nucleus 
targeting sequence of the large T antigen, which is well characterized in the art. The vector can 
additionally include various sequences to promote correct expression of the recombinant protein, 
including transcription promoter and termination sequences, enhancer sequences, preferred 
ribosome binding site sequences, preferred mRNA leader sequences, preferred protein processing 
sequences, preferred signal sequences for protein secretion, and the like. The DNA sequence 
encoding the gene of interest can also be manipulated to remove potentially inhibiting sequences 
or to minimize unwanted secondary structure formation. As will be appreciated by the 
practitioner in the art, the recombinant protein can also be expressed as a fusion protein. 

After translation, the protein can be purified from the cells themselves or recovered from 
the culture medium. The DNA can also include sequences which aid in expression and/or 
purification of the recombinant protein. The DNA can be expressed directly or can be expressed 
as part of a fusion protein having a readily cleavable fusion junction. 

The DNA may also be expressed in a suitable mammalian host. Useful hosts include 
fibroblast 3T3 cells (e.g., NIH 3T3, from CRL 1658), COS (simian kidney, ATCC, CRL-1650) or 
CHO (Chinese hamster ovary) cells (e.g., CHO-DXB11, from Chasin (1980) Proc. Natl. Acad. 
Sci. USA 77: 4216-4222), mink-lung epithelial cells (Mv1Lu), human foreskin fibroblast cells, 
human glioblastoma cells, and teratocarcinoma cells. Other useful eukaryotic cell systems 
include yeast cells, the insect/baculovirus system or myeloma cells. 

In order to express a candidate molecule, the DNA is subcloned into an insertion site of a 
suitable, commercially available vector along with suitable promoter/enhancer sequences and 3' 
termination sequences. Useful promoter/enhancer sequence combinations include the CMV 
promoter (human cytomegalovirus (MIE) promoter) present, for example, on pCDM8, as well as 
the mammary tumor virus promoter (MMTV) boosted by the Rous sarcoma virus LTR enhancer 
sequence (e.g., from Clontech, Inc., Palo Alto). A useful inducible promoter includes, for 
example, a Zn2+-inducible promoter, such as the Zn2+ metallothionein promoter (Wrana et al. 
(1992) Cell 71: 1003-1014). Other inducible promoters are well known in the art and can be 
used with similar success. Expression also can be further enhanced using trans-activating 
enhancer sequences. The plasmid also preferably contains an amplifiable marker, such as DHFR 
under suitable promoter control, e.g., SV40 early promoter (ATCC #37148). Transfection, cell 
culturing, gene amplification and protein expression conditions are standard conditions, well 
known in the art, such as are described, for example in Ausubel et al., ed., (1989) "Current 
Protocols in Molecular Biology", John Wiley & Sons, NY. Briefly, transfected cells are cultured 
in medium containing 5-10% dialyzed fetal calf serum (dFCS), and stably transfected high 
expression cell lines obtained by amplification and subcloning and evaluated by standard 
Western and Northern blot analysis. Southern blots also can be used to assess the state of 
integrated sequences and the extent of their copy number amplification. 

The expressed candidate protein is then purified using standard procedures. A currently 
preferred methodology uses an affinity column, such as a ligand affinity column or an antibody 
affinity column. The column then is washed, and the candidate molecules selectively eluted in a 
gradient of increasing ionic strength, changes in pH, or addition of mild detergent. It is 
appreciated that in addition to the candidate molecules which bind to the breast cancer-associated 
proteins, the breast cancer associated proteins themselves may likewise be produced using such 
recombinant DNA technologies. 

4. Breast Cancer Therapy and Methods for Monitoring Therapy 

The skilled artisan, after identification of breast cancer-associated proteins and proteins 
which interact with the breast cancer-associated proteins, can develop a variety of therapies for 
treating breast cancer. Because the marker proteins described herein are present at detectably 
higher levels in breast cancer cells relative to normal breast cells, the skilled artisan may employ, 
for example, the marker proteins and/or nucleic acids encoding the marker proteins as target 
molecules for a cancer chemotherapy. 

A particularly useful cancer therapeutic envisioned is an oligonucleotide or peptide 
nucleic acid sequence complementary and capable of hybridizing under physiological conditions 
to part, or all, of the gene encoding the marker protein or to part, or all, of the transcript encoding 
the marker protein thereby to reduce or inhibit transcription and/or translation of the marker 
protein gene. Alternatively, the same technologies may be applied to reduce or inhibit 
transcription and/or translation of the proteins which interact with the breast cancer-associated 
proteins. 

Anti-sense oligonucleotides have been used extensively to inhibit gene expression in 
normal and abnormal cells. See, for example, Stein et al. (1988) Cancer Res. 48: 2659-2668, for
a pertinent review of anti-sense theory and established protocols. In addition, the synthesis and 
use of peptide nucleic acids as anti-sense-based therapeutics are described in PCT publications 
PCT/EP92/01219 published November 26, 1992, PCT/US92/10921 published June 24, 1993,
and PCT/US94/013523 published June 1, 1995. Accordingly, the anti-sense-based therapeutics 
may be used as part of chemotherapy, either alone or in combination with other therapies. 

Anti-sense oligonucleotide and peptide nucleic acid sequences are capable of hybridizing 
to a gene and/or mRNA transcript and, therefore, may be used to inhibit transcription and/or 
translation of the protein described herein. It is appreciated, however, that oligoribonucleotide 
sequences generally are more susceptible to enzymatic attack by ribonucleases than are 
deoxyribonucleotide sequences. Hence, oligodeoxyribonucleotides are preferred over 
oligoribonucleotides for in vivo therapeutic use. It is appreciated that the peptide nucleic acid 
sequences, unlike regular nucleic acid sequences, are not susceptible to nuclease degradation and, 
therefore, are likely to have greater longevity in vivo. Furthermore, it is appreciated that peptide 
nucleic acid sequences bind complementary single stranded DNA and RNA strands more 
strongly than corresponding DNA sequences (see, for example, PCT/EP92/20702 published 
November 26, 1992). Accordingly, peptide nucleic acid sequences are preferred for in vivo 
therapeutic use. 

Therapeutically useful anti-sense oligonucleotides or peptide nucleic acid sequences may 
be synthesized by any of the known chemical oligonucleotide and peptide nucleic acid synthesis
methodologies well known and thoroughly described in the art. Alternatively, a sequence
complementary to part or all of the natural mRNA sequence may be generated using standard 
recombinant DNA technologies. 

Because the complete nucleotide sequence encoding the entire marker protein as well as 
additional 5' and 3' untranslated sequences are known for each of the marker proteins and/or can 
be determined readily using techniques well known in the art, anti-sense oligonucleotides or 
peptide nucleic acids which hybridize with any portion of the mRNA transcript or non-coding 
sequences may be prepared using conventional oligonucleotide and peptide nucleic acid synthesis 
methodologies. 

Oligonucleotides complementary to, and hybridizable with, any portion of the mRNA
transcripts encoding the marker proteins are, in principle, effective for inhibiting translation of 
the target proteins as described herein. For example, as described in U.S. Pat. No. 5,098,890, 
issued March 24, 1992, oligonucleotides complementary to mRNA at or near the translation 
initiation codon site may be used to inhibit translation. Moreover, it has been suggested that 
sequences that are too distant in the 3' direction from the translation initiation site may be less 
effective in hybridizing the mRNA transcripts because of potential ribosomal "read-through", a 
phenomenon whereby the ribosome is postulated to unravel the anti-sense/sense duplex to permit 
translation of the message. 
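
By way of illustration only, the selection of an anti-sense sequence at or near the translation initiation codon can be sketched in a few lines of Python. The function names, the 20-nucleobase window, the five-base upstream offset, and the example transcript are hypothetical choices made for this sketch and are not taken from the disclosure. Searching for the first AUG is a simplification; the actual initiation codon of a given transcript would need to be known from its sequence annotation.

```python
# Illustrative sketch only: names, defaults, and the example transcript are hypothetical.

# Maps each mRNA base to the complementary DNA base of the anti-sense strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def antisense_dna(mrna: str, start: int, length: int) -> str:
    """Return the anti-sense oligodeoxynucleotide (written 5'->3') that is
    complementary to mrna[start:start + length]."""
    target = mrna[start:start + length].upper()
    if len(target) != length:
        raise ValueError("window extends past the end of the transcript")
    # Reverse the target, then complement each base, to get the 5'->3' anti-sense strand.
    return "".join(COMPLEMENT[base] for base in reversed(target))


def oligo_at_start_codon(mrna: str, length: int = 20, upstream: int = 5) -> str:
    """Design an oligo whose target window begins `upstream` bases 5' of the
    first AUG found in the transcript (an assumed stand-in for the true
    translation initiation codon)."""
    aug = mrna.upper().find("AUG")
    if aug < 0:
        raise ValueError("no AUG codon found")
    start = max(0, aug - upstream)
    return antisense_dna(mrna, start, length)


# Hypothetical transcript fragment, not a sequence from the disclosure.
example_mrna = "GGCACCAUGGCUUCAGGAUCCAAGUGA"
print(oligo_at_start_codon(example_mrna))
```

The window length is left as a parameter so that different oligonucleotide lengths can be compared; the sketch makes no attempt to predict hybridization specificity, uptake, or nuclease stability.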

A variety of sequence lengths of oligonucleotide or peptide nucleic acid may be used to 
hybridize to mRNA transcripts. However, very short sequences (e.g., sequences containing less 
than 8-15 nucleobases) may bind with less specificity. Moreover, for in vivo use, short 
oligonucleotide sequences may be particularly susceptible to enzymatic degradation. Peptide 
nucleic acids, as mentioned above, likely are resistant to nuclease degradation. Where 
oligonucleotide and peptide nucleic acid sequences are to be provided directly to the cells, very 
long sequences may be less effective at inhibition because of decreased uptake by the target cell. 
Accordingly, where the oligonucleotide or peptide nucleic acid is to be provided directly to target 
cells, oligonucleotide and/or peptide nucleic acid sequences containing about 8-50 nucleobases, 
and more preferably 15-30 nucleobases, are envisioned to be most advantageous.

An alternative means for providing anti-sense oligonucleotide sequences to a target cell is
gene therapy where, for example, a DNA sequence, preferably as part of a vector and associated
with a promoter, is expressed constitutively inside the target cell. Oeller et al. (Oeller et al.
(1992) Science 254: 437-539) describe the in vivo inhibition of the ACC synthase enzyme using a
constitutively expressible DNA sequence encoding an anti-sense sequence to the full length ACC 
synthase transcript. Accordingly, where the anti-sense oligonucleotide sequences are provided to 
a target cell indirectly, for example, as part of an expressible gene sequence to be expressed 
within the cell, longer oligonucleotide sequences, including sequences complementary to 
substantially all the protein coding sequence, may be used to advantage. 

Finally, therapeutically useful oligonucleotide sequences envisioned also include not only 
native oligomers composed of naturally occurring nucleotides, but also those comprising 
modified nucleotides, for example, to improve stability and lipid solubility and thereby enhance 
cellular uptake. For example, it is known that enhanced lipid solubility and/or resistance to 
nuclease digestion results by substituting a methyl group or sulfur atom for a phosphate oxygen 
in the internucleotide phosphodiester linkage. Phosphorothioates ("S-oligonucleotides" wherein
a phosphate oxygen is replaced by a sulfur atom), in particular, are stable to nuclease cleavage, 
are soluble in lipids, and are preferred, particularly for direct oligonucleotide administration. S- 
oligonucleotides may be synthesized chemically using conventional synthesis methodologies 
well known and thoroughly described in the art. 

Preferred synthetic internucleoside linkages include phosphorothioates, alkylphosphonates,
phosphorodithioates, phosphate esters, alkylphosphonothioates, phosphoramidates, carbamates,
carbonates, phosphate triesters, acetamidate, and carboxymethyl esters. Furthermore, one or more of
the 5'-3' phosphate groups may be covalently joined to a low molecular weight (e.g., 15-500 Da)
organic group, including, for example, lower alkyl chains or aliphatic groups (e.g., methyl, ethyl, 
propyl, butyl), substituted alkyl and aliphatic groups (e.g., aminoethyl, aminopropyl, 
aminohydroxyethyl, aminohydroxypropyl), small saccharides or glycosyl groups. Other low 
molecular weight organic modifications include additions to the internucleoside phosphate linkages
such as cholesteryl or diamine compounds with varying numbers of carbon residues between the 
amino groups and terminal ribose. Oligonucleotides with these linkages or with other modifications 
can be prepared using methods well known in the art (see, for example, U.S. Pat. No. 5,149,798).

Suitable oligonucleotide and/or peptide nucleic acid sequences which inhibit transcription
and/or translation of the marker proteins can be identified using standard in vivo assays well 
characterized in the art. Preferably, a range of doses is used to determine effective concentrations 
for inhibition as well as specificity of hybridization. For example, in the case of an
oligonucleotide, a dose range of 0-100 µg oligonucleotide/ml may be assayed. Further, the
oligonucleotides may be provided to the cells in a single transfection, or as part of a series of 
transfections. Anti-sense efficacy may be determined by assaying a change in cell proliferation 
over time following transfection, using standard cell counting methodology and/or by assaying 
for reduced expression of marker protein, e.g., by immunofluorescence. Alternatively, the ability 
of cells to take up and use thymidine is another standard means of assaying for cell division and 
may be used here, e.g., using ³H-thymidine. Effective anti-sense inhibition should inhibit cell
division sufficiently to reduce thymidine uptake, inhibit cell proliferation, and/or reduce 
detectable levels of marker proteins. 
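
As a hedged illustration of how data from such a dose range experiment might be summarized, the following Python sketch converts thymidine uptake readings into percent inhibition relative to the untreated (zero-dose) control and interpolates the dose giving 50% inhibition. The half-maximal summary, the function names, and the example readings are assumptions introduced for this sketch; the disclosure itself specifies only that a range of doses is assayed.

```python
# Illustrative sketch only: function names and example readings are hypothetical.
from typing import Optional, Sequence


def percent_inhibition(control: float, treated: float) -> float:
    """Percent reduction of a readout (e.g., thymidine uptake) relative to control."""
    if control <= 0:
        raise ValueError("control reading must be positive")
    return 100.0 * (control - treated) / control


def dose_for_half_inhibition(doses: Sequence[float],
                             uptake: Sequence[float]) -> Optional[float]:
    """Linearly interpolate the dose giving 50% inhibition.

    `doses` must be ascending with doses[0] == 0 (the untreated control);
    `uptake` holds the matching readings. Returns None if 50% inhibition
    is never reached within the tested range.
    """
    control = uptake[0]
    inhibition = [percent_inhibition(control, u) for u in uptake]
    for i in range(1, len(doses)):
        low, high = inhibition[i - 1], inhibition[i]
        if low < 50.0 <= high:
            fraction = (50.0 - low) / (high - low)
            return doses[i - 1] + fraction * (doses[i] - doses[i - 1])
    return None


# Hypothetical readings (counts per minute) at 0-100 µg oligonucleotide/ml.
example_doses = [0.0, 10.0, 25.0, 50.0, 100.0]
example_uptake = [10000.0, 8000.0, 6000.0, 4000.0, 2000.0]
print(dose_for_half_inhibition(example_doses, example_uptake))
```

Linear interpolation is used only for simplicity; a fitted dose-response curve would ordinarily be preferred where enough dose points are available.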

It is anticipated that therapeutically effective oligonucleotide or peptide nucleic acid 
concentrations may vary according to the nature and extent of the neoplasm, the particular 
nucleobase sequence used, the relative sensitivity of the neoplasm to the oligonucleotide or 
peptide nucleic acid sequence, and other factors. Useful ranges for a given cell type and 
oligonucleotide and/or peptide nucleic acid may be determined by performing standard dose 
range experiments. Dose range experiments also may be performed to assess toxicity levels for 
normal and malignant cells. It is contemplated that useful concentrations may range from about 
1 to 100 µg/ml per 10^ cells.

For in vivo use, the anti-sense oligonucleotide or peptide nucleic acid sequences may be 
combined with a pharmaceutically acceptable carrier, such as a suitable liquid vehicle or 
excipient, and optionally an auxiliary additive or additives. Liquid vehicles and excipients are 
conventional and are available commercially. Illustrative thereof are distilled water, 
physiological saline, aqueous solutions of dextrose, and the like. For in vivo cancer therapies, the 
anti-sense sequences preferably can be provided directly to malignant cells, for example, by 
injection directly into the tumor. Alternatively, the oligonucleotide or peptide nucleic acid may 
be administered systemically, provided that the anti-sense sequence is associated with means for 
directing the sequences to the target malignant cells.

In addition to administration with conventional carriers, the anti-sense oligonucleotide or
peptide nucleic acid sequences may be administered by a variety of specialized oligonucleotide 
delivery techniques. For example, oligonucleotides may be encapsulated in liposomes, as 
described in Mannino et al. (1988) BioTechnology 6: 682, and Felgner et al. (1989) Bethesda
Res. Lab. Focus 11: 21. Lipids useful in producing liposomal formulations include, without
limitation, monoglycerides, diglycerides, sulfatides, lysolecithin, phospholipids, saponin, bile 
acids, and the like. Preparation of such liposomal formulations is within the level of skill in the
art (see, for example, U.S. Pat. No. 4,235,871; U.S. Pat. No. 4,501,728; U.S. Pat. No.
4,837,028; and U.S. Pat. No. 4,737,323). The pharmaceutical composition of the invention may 
further include compounds such as cyclodextrins and the like which enhance delivery of 
oligonucleotides into cells. When the composition is not administered systemically but, rather, is 
injected at the site of the target cells, cationic detergents (e.g. Lipofectin) may be added to 
enhance uptake. In addition, reconstituted virus envelopes have been successfully used to deliver 
RNA and DNA to cells (see, for example, Arad et al. (1986) Biochem. Biophys. Acta 859: 88-94).

For therapeutic use in vivo, the anti-sense oligonucleotide and/or peptide nucleic acid 
sequences are administered to the individual in a therapeutically effective amount, for example, 
an amount sufficient to reduce or inhibit target protein expression in malignant cells. The actual 
dosage administered may take into account whether the treatment is prophylactic or
therapeutic, the age, weight, and health of the patient, the route of administration, the size
and nature of the malignancy, as well as other factors. The daily dosage may range from about 
0.01 to 1,000 mg per day. Greater or lesser amounts of oligonucleotide or peptide nucleic acid 
sequences may be administered, as required. As will be appreciated by those skilled in the 
medical art, particularly the chemotherapeutic art, appropriate dose ranges for in vivo 
administration would be routine experimentation for a clinician. As a preliminary guideline, 
effective concentrations for in vitro inhibition of the target molecule may be determined first. 

4.B. Binding Protein-based Therapeutics.

As mentioned above, a cancer marker protein or a protein that interacts with the cancer 
marker protein may be used as a target for chemotherapy. For example, a binding protein 
designed to bind the marker protein essentially irreversibly can be provided to the malignant
cells, for example, by association with a ligand specific for the cell and known to be absorbed by
the cell. Means for targeting molecules to particular cells and cell types are well described in the 
chemotherapeutic art. 

Binding proteins may be obtained and tested using technologies well known in the art. 
For example, the binding portions of antibodies may be used to advantage. It is contemplated, 
however, that intact antibodies or BABS that have preferably been humanized may be used in the 
practice of the invention. As used herein, the term "humanized" is understood to mean a process 
whereby the framework region sequences of a non-human immunoglobulin variable region are 
replaced by corresponding human framework sequences. Accordingly, it is contemplated that 
such humanized binding proteins will elicit a weaker immune response than their unhumanized
counterparts. Particularly useful are binding proteins identified with high affinity for the target
protein, e.g., greater than about 10^ M^-1. Alternatively, DNA encoding the binding protein may
be provided to the target cell as part of an expressible gene to be expressed within the cell 
following the procedures used for gene therapy protocols well described in the art. See, for 
example, U.S. Patent No. 4,497,796, and "Gene Transfer", Vijay R. Baichwal, ed., (1986). It is
anticipated that, once bound by binding protein, the target protein will be inactivated or its 
biological activity reduced thereby inhibiting or retarding cell division. 

As described above, suitable binding proteins for in vivo use may be combined with a 
suitable pharmaceutically-acceptable carrier, such as physiological saline or other useful carriers 
well characterized in the medical art. The pharmaceutical compositions may be provided directly 
to malignant cells, for example, by direct injection, or may be provided systemically, provided 
the binding protein is associated with means for targeting the protein to target cells. Finally, 
suitable dose ranges and cell toxicity levels may be assessed using standard dose range 
experiments. Therapeutically-effective concentrations may range from about 0.01 to about 1,000 
mg per day. As described above, actual dosages administered may vary depending, for example, 
on the nature of the malignancy, the age, weight and health of the individual, as well as other 
factors.

4.C. Small Molecule-based Therapeutics.

After having isolated breast cancer-associated proteins, the skilled artisan can, using 
methodologies well known in the art, screen small molecule libraries (either peptide or non- 
peptide based libraries) to identify candidate molecules that reduce or inhibit the biological 
function of the breast cancer-associated proteins. The small molecules preferably accomplish
this function by reducing the in vivo expression of the target molecule, or by interacting with the
target molecule thereby to inhibit either the biological activity of the target molecule or an 
interaction between the target molecule and its in vivo binding partner. 

It is contemplated that, once the candidate small molecules have been elucidated, the 
skilled artisan may enhance the efficacy of the small molecule using rational drug design 
methodologies well known in the art. Alternatively, the skilled artisan may use a variety of 
computer programs which assist the skilled artisan to develop quantitative structure activity 
relationships (QSAR) which further assist the design of additional candidate molecules de
novo. Once identified, the small molecules may be produced in commercial quantities and 
subjected to the appropriate safety and efficacy studies. 

It is contemplated that the screening assays may be automated thereby facilitating the 
screening of a large number of small molecules at the same time. Such automation procedures 
are within the level of skill in the art of drug screening and, therefore, are not discussed herein. 

Candidate peptide-based small molecules may be produced by expression of an 
appropriate nucleic acid sequence in a host cell or using synthetic organic chemistries. Similarly, 
non-peptidyl-based small molecules may be produced using conventional synthetic organic 
chemistries well known in the art. 

As described above, for in vivo use, the identified small molecules may be combined with 
a suitable pharmaceutically acceptable carrier, such as physiological saline or other useful 
carriers well characterized in the medical art. The pharmaceutical compositions may be provided 
directly to malignant cells, for example, by direct injection, or may be provided systemically, 
provided the small molecule is associated with means for targeting the molecule to target cells.
Finally, suitable dose ranges and cell toxicity levels may be assessed using standard dose range 
experiments. As described above, actual dosages administered may vary depending, for
example, on the nature of the malignancy, the age, weight and health of the individual, as well as
other factors. 

4.D. Methods for Monitoring the Status of Breast Cancer in an Individual

The progression of the breast cancer or the therapeutic efficacy of chemotherapy may be 
measured using procedures well known in the art. For example, the efficacy of a particular 
chemotherapeutic agent can be determined by measuring the amount of a breast cancer- 
associated protein released from breast cancer cells undergoing cell death. As reported in U.S. 
Patent Nos. 5,840,503 and 5,965,376, soluble nuclear matrix proteins and fragments thereof are 
released by cells upon cell death. Such soluble nuclear matrix proteins can be quantitated in a 
body fluid and used to monitor the degree or rate of cell death in a tissue. Similarly, the levels of 
one or more breast cancer-associated proteins could be used as an indication of the status of 
breast cancer in the individual. 

For example, the concentration of a breast cancer-associated protein or a fragment thereof 
released from cells is compared to standards from healthy, untreated tissue. Fluid samples are 
collected at discrete intervals during treatment and compared to the standard. It is contemplated 
that changes in the level of the breast cancer-associated protein, for example, will be indicative of 
the efficacy of treatment (that is, the rate of cancer cell death). It is contemplated that the release 
of soluble, breast cancer-associated proteins can be measured in blood, plasma, urine, sputum,
vaginal secretions, breast exudate, and other body fluids.
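
The comparison of serial fluid samples against a standard can be sketched, purely for illustration, as follows. The function names, the 10% tolerance used to call a change, and the example concentrations are hypothetical and are not taken from the disclosure; how an observed change is interpreted clinically depends on the marker and on the context of the measurement.

```python
# Illustrative sketch only: names, the tolerance, and example values are hypothetical.
from typing import List, Sequence


def relative_to_standard(samples: Sequence[float], standard: float) -> List[float]:
    """Express each measured marker concentration as a ratio to the standard
    obtained from healthy, untreated tissue."""
    if standard <= 0:
        raise ValueError("standard concentration must be positive")
    return [value / standard for value in samples]


def classify_change(levels: Sequence[float], tolerance: float = 0.10) -> str:
    """Compare the most recent level with the first (baseline) level.

    Returns "increase", "decrease", or "stable" depending on whether the
    relative change exceeds the assumed tolerance.
    """
    baseline, latest = levels[0], levels[-1]
    if baseline <= 0:
        raise ValueError("baseline level must be positive")
    change = (latest - baseline) / baseline
    if change > tolerance:
        return "increase"
    if change < -tolerance:
        return "decrease"
    return "stable"


# Hypothetical marker concentrations from samples taken at successive intervals.
example_levels = [100.0, 120.0, 150.0]
print(relative_to_standard(example_levels, standard=50.0))
print(classify_change(example_levels))
```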

Where the assay is used to monitor tissue viability or progression of breast cancer, the 
step of detecting the presence and abundance of the marker protein or its transcript in samples of 
interest is repeated at intervals and these values then are compared, the changes in the detected 
concentrations reflecting changes in the status of the tissue. For example, an increase in the level 
of one or more breast cancer-associated proteins may correlate with progression of the breast 
cancer. Where the assay is used to evaluate the efficacy of a therapy, the monitoring steps occur 
following administration of the therapeutic agent or procedure (e.g., following administration of
a chemotherapeutic agent or following radiation treatment). Similarly, a decrease in the level of
breast cancer-associated proteins may correlate with a regression of the breast cancer.

Thus, breast cancer may be identified by the presence of breast cancer-associated proteins
as taught herein. Once identified, the breast cancer may be treated using compounds that reduce 
in vivo the expression and/or biological activity of the breast cancer-associated proteins. 
Furthermore, the methods provided herein can be used to monitor the progression and/or 
treatment of the disease. The following non-limiting examples provide details of the isolation 
and characterization of breast cancer-associated proteins and methods for their use in the 
detection of breast cancer. 

Example 1 - Identification of Breast Cancer Markers 

To identify markers for breast cancer, the sera of individuals with breast cancer were 
compared to the sera of normal individuals by surface-enhanced laser desorption and ionization 
(SELDI) mass spectrometry. Briefly, 0.5 mL aliquots of sera harvested from the individuals were 
thawed. Then, 1 µL of a 1 mg/mL solution of soybean trypsin inhibitor (SBTI) and 1 µL of a 1
mg/mL solution of leupeptin were added to each aliquot. To remove lipids, 350 µL of 1,1,2-
trifluorotrichloroethane was added to each sample. The samples then were vortexed for five
minutes and centrifuged in a microcentrifuge for five minutes at 4°C. The resulting supernatants
were applied to a 1 mL column of agarose coupled to protein G (Hitrap Protein G column,
Pharmacia and Upjohn, Peapack, NJ) to remove immunoglobulin proteins. The column then was
rinsed with 3 mL of 50 mM sodium phosphate, pH 7.0, with SBTI and leupeptin ("binding
buffer"), and the resulting flowthrough applied directly to a 5 mL column of 6% Sepharose
coupled to Cibacron blue (Hitrap blue column, Pharmacia and Upjohn, Peapack, NJ) to remove
albumin proteins. The Hitrap blue column was rinsed with 20 mL of binding buffer. The
resulting flowthrough was concentrated using four centrifugation-based concentrators with a
10 kD cutoff (Centricon 10, Millipore Corporation, Bedford, MA) to a final volume of about 0.7
mL. 

The resulting serum (substantially free of immunoglobulin and albumin) was subdivided
into twelve fractions containing approximately equal amounts of protein by ion exchange
chromatography. Specifically, the serum was applied to a Mono Q (Pharmacia and Upjohn,
Peapack, NJ) ion exchange column (a strong anion exchanger with quaternary ammonium
groups) in 50 mM sodium phosphate buffer, pH 7.0 and proteins were eluted from the column by
increasing the concentration of sodium chloride in a stepwise manner. Thus, the serum was
divided into twelve fractions based on the concentration of sodium chloride used for elution.
These fractions accordingly were designated flow through, 25 mM, 50 mM, 75 mM, 100 mM,
125 mM, 150 mM, 200 mM, 250 mM, 300 mM, 400 mM, and 2M sodium chloride. After
elution, each fraction was concentrated to approximately 100 µg/mL and buffer exchanged into
binding buffer.

Then 4-10 µL from each of the twelve fractions were applied and allowed to bind to each
of four SELDI chip surfaces, each surface holding up to eight samples. The intended location of
each sample on the chip was demarcated with a circle drawn using a hydrophobic marker like
those used in Pap smears. The SELDI chips used herein were purchased from Ciphergen
Biosystems, Inc., Palo Alto, California, and used as described below.

For copper or nickel surfaces, a chip containing ethylenediaminetriacetic acid moieties
(IMAC, Ciphergen Biosystems, Inc., Palo Alto, CA) was pretreated with two five-minute
applications of five µL of a copper salt or nickel salt solution, and washed with deionized water.
After a five-minute treatment with five µL of binding buffer, two to three microliters of sample
were applied to the surface for thirty to sixty minutes. Another two to three microliters of sample
were then applied for an additional thirty to sixty minutes. The chips then were washed twice
with binding buffer to remove unbound proteins. 0.5 µL of sinapinic acid (12.5 mg/mL) was
added twice and allowed to dry each time. The presence of sinapinic acid enhances the
vaporization and ionization of the bound proteins upon mass spectrometry.

For chip surfaces containing carboxyl moieties (WCX-2, Ciphergen Biosystems, Inc.,
Palo Alto, CA), before use of the hydrophobic pen, the surface was washed with 10 mM HCl for
thirty minutes and rinsed five times with deionized water. After use of the pen, the surface was
washed five times with five µL of binding buffer and once with deionized water. Two to three
µL of sample were applied in two applications of thirty to sixty minutes each. The surface was
washed twice with 5 µL of binding buffer, and 0.5 µL of sinapinic acid were applied twice.

For chip surfaces containing quaternary ammonium moieties (SAX-2, Ciphergen
Biosystems, Inc., Palo Alto, CA), after use of the pen, the surface was washed five times with
five µL of binding buffer and once with deionized water. Application of sample, washing, and
application of sinapinic acid were done as described above.

The chips then were subjected to mass spectrometry utilizing a Ciphergen SELDI PBS 
One (Ciphergen Biosystems, Inc., Palo Alto, CA) running the software program "SELDI v. 2.0". 
For all chips, "high mass" was set to 200,000 Daltons, "starting detector sensitivity" was set to 9 
(from a range of 1-10, with 10 being the highest sensitivity), NDF (neutral density filter) was set 
to "OUT", data acquisition method was set to "Seldi Quantitation", SELDI acquisition 
parameters were set to 20, with increments of 5, and warming with two shots at intensity 50 (out 
of 100) was included. For IMAC chips, mass was optimized from 3000 Daltons to 3001 Daltons, 
starting laser intensity was set to 80 (out of 100), and transients set to 5 (i.e., 5 laser shots per 
site). Peaks were identified automatically by the computer. For WCX-2 chips, mass was 
optimized from 3,000 Daltons to 50,000 Daltons, starting laser intensity was set to 80, and 
transients set to 8. Peaks were identified automatically by the computer. For SAX-2 chips, mass 
was optimized from 3,000 Daltons to 50,000 Daltons, starting laser intensity was set to 85, and 
transients set to 8. Peaks were identified automatically by the computer. 

Ten serum samples (five from normal individuals and five from individuals with breast 
cancer) were analyzed by mass spectrometry to identify the proteins present in the sixty fractions 
described above. The resulting peaks in the mass spectrometry trace were compared to identify 
those peaks present in the serum samples from individuals with breast cancer but not present in 
the normal samples. If peaks in different samples had a mass difference of no more than one 
percent, the peaks were presumed to be the same. Eleven mass spectrometry peaks ranging in 
size from just over 11,000 Da to approximately 103,000 Da were identified as present in all five
serum samples from individuals with breast cancer and in none of the samples from normal 
individuals. The presence or absence of these peaks was then determined for an additional thirty 
serum samples (fifteen from normal individuals and fifteen from individuals with breast cancer). 
Seven other peaks that were present in four of the original five breast cancer serum samples, but 
not in any of the normal samples, were also analyzed because they were present in the same 
fraction and on the same SELDI surface as one or more of the eleven peaks already under 
evaluation. Of the eighteen peaks studied, fifteen were present in fifteen or more of the twenty 
breast cancer serum samples, but absent from 15 or more of the normal serum samples. 
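
The peak comparison described above (peaks treated as the same when their masses differ by no more than one percent, and retained when present in every breast cancer sample and absent from every normal sample) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the one percent tolerance is taken relative to the larger of the two masses, candidate peaks are drawn from the first cancer sample, and the function names and example masses are hypothetical rather than data from Table 1.

```python
# Illustrative sketch only: names and example masses are hypothetical.
from typing import List, Sequence


def same_peak(mass_a: float, mass_b: float, tolerance: float = 0.01) -> bool:
    """Treat two peaks as the same if their masses differ by no more than
    `tolerance` (assumed here to be relative to the larger mass)."""
    return abs(mass_a - mass_b) <= tolerance * max(mass_a, mass_b)


def peak_present(mass: float, spectrum: Sequence[float],
                 tolerance: float = 0.01) -> bool:
    """True if any peak in `spectrum` matches `mass` within the tolerance."""
    return any(same_peak(mass, other, tolerance) for other in spectrum)


def cancer_specific_peaks(cancer_spectra: Sequence[Sequence[float]],
                          normal_spectra: Sequence[Sequence[float]],
                          tolerance: float = 0.01) -> List[float]:
    """Return peaks of the first cancer spectrum that are found in every
    cancer spectrum and in none of the normal spectra."""
    candidates = cancer_spectra[0]
    return [
        mass for mass in candidates
        if all(peak_present(mass, s, tolerance) for s in cancer_spectra)
        and not any(peak_present(mass, s, tolerance) for s in normal_spectra)
    ]


# Hypothetical peak lists (masses in Da) for three cancer and two normal samples.
cancer = [
    [11200.0, 28300.0, 66000.0],
    [11250.0, 28310.0, 66500.0],
    [11180.0, 45000.0, 66100.0],
]
normal = [
    [28290.0, 66050.0],
    [66400.0, 50000.0],
]
print(cancer_specific_peaks(cancer, normal))
```

Relaxing the "every cancer sample" and "no normal sample" conditions to counts (for example, fifteen or more of twenty) would require replacing `all` and `any` with tallies, which is omitted here for brevity.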






The results of the foregoing analyses are summarized in Table 1. The masses listed in the 
table are presumed accurate to within one percent.

TABLE 1.

Example 2 - Sequencing of Breast Cancer Marker Proteins

Breast cancer-associated proteins based upon the biochemical and mass spectrometry data 
provided above may be better characterized using well-known techniques. For example, samples
of the serum can be fractionated using, for example, column chromatography and/or 
electrophoresis, to produce purified protein samples corresponding to each of the proteins 
identified in Table 1. The sequences of the isolated proteins can then be determined using
conventional peptide sequencing methodologies (see Examples 5 and 6). It is appreciated that 
the skilled artisan, in view of the foregoing disclosure, would be able to produce an antibody 
directed against any breast cancer-associated protein identified by the methods described herein. 
Moreover, the skilled artisan, in view of the foregoing disclosure, would be able to produce 
nucleic acid sequences that encode the fragments described above, as well as nucleic acid 
sequences complementary thereto. In addition, the skilled artisan using conventional
recombinant DNA methodologies, for example, by screening a cDNA library with such a nucleic
acid sequence, would be able to isolate full length nucleic acid sequences encoding target breast
cancer-associated proteins. Such full length nucleic acid sequences, or fragments thereof, may be
used to generate nucleic acid-based detection systems or therapeutics.

Example 3 - Production of Antibodies Which Bind Specifically to Breast Cancer-associated
Proteins

Once identified, a breast cancer-associated protein may be detected in a tissue or body
fluid sample using numerous binding assays that are well known to those of ordinary skill in the
art. For example, as discussed above, a breast cancer-associated protein may be detected in
either a tissue or body fluid sample using an antibody, for example, a monoclonal antibody, 
which binds specifically to an epitope disposed upon the breast cancer-associated protein. In 
such detection systems, the antibody preferably is labeled with a detectable moiety. 

Provided below is an exemplary protocol for the production of an anti-breast cancer- 
associated monoclonal antibody. Other protocols also are envisioned. Accordingly, the 
particular method of producing antibodies to target proteins is not envisioned to be an aspect of 
the invention.

Balb/c ByJ mice (Jackson Laboratory, Bar Harbor, ME) are injected intraperitoneally
with the target protein every 2 weeks until the immunized mice obtain the appropriate serum 
titer. Thereafter, the mice are injected with 3 consecutive intravenous boosts. Freund's complete 
adjuvant (Gibco, Grand Island) is used in the first injection, incomplete Freund's in the second 
injection; and saline is used for subsequent intravenous injections. The animal then is sacrificed 
and its spleen removed. Spleen cells (or lymph node cells) then are fused with a mouse myeloma
line, e.g., using the method of Kohler et al. (1975) Nature 256: 495. Hybridomas producing
antibodies that react with the target proteins then are cloned and grown as ascites. Hybridomas
are screened by reactivity to the immunogen in any desirable assay. Detailed descriptions of 
screening protocols, ascites production and immunoassays also are disclosed in 
PCT/US92/09220, published May 13, 1993. 

Example 4 - Antibody-based Assay for Detecting Breast Cancer in an Individual

The following assay has been developed for tissue samples; however, it is contemplated 
that similar assays for testing fluid samples may be developed without undue experimentation. A 
typical assay may employ a commercial immunodetection kit, for example, the ABC Elite Kit 
from Vector Laboratories, Inc. 

A biopsy sample is removed from the patient under investigation in accordance with the 
appropriate medical guidelines. The sample then is applied to a glass microscope slide and the 
sample fixed in cold acetone for 10 minutes. Then, the slide is rinsed in distilled water and 
pretreated with a hydrogen peroxide containing solution (2 mL 30% H2O2 and 30 mL cold 
methanol). The slide then is rinsed in a Buffer A comprising Tris Buffered Saline (TBS) with 
0.1% Tween and 0.1% Brij. A mouse anti-breast cancer-associated protein monoclonal antibody 
in Buffer A is added to the slide and the slide then incubated for one hour at room temperature. 
The slide then is washed with Buffer A, and a secondary antibody (ABC Elite Kit, Vector Labs,
Inc.) in Buffer A is added to the slide. The slide then is incubated for 15 minutes at 37°C in a
humidity chamber. The slide is washed again with Buffer A, and the ABC reagent (ABC Elite
Kit, Vector Labs, Inc.) is then added to the slide for amplification of the signal. The slide is then
incubated for a further 15 minutes at 37°C in the humidity chamber.
The slide then is washed in distilled water, and a diaminobenzidine (DAB) substrate is
added to the slide for 4-5 minutes. The slide then is rinsed with distilled water, counterstained
with hematoxylin, rinsed with 95% ethanol, rinsed with 100% ethanol, and then rinsed with
xylene. A cover slip is then applied to the slide and the result observed by light microscopy.
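
As an illustrative aside, the nominal hydrogen peroxide concentration of the pretreatment solution can be worked out from the volumes stated above. The following Python sketch is not part of the described assay; the function name `diluted_percent` is hypothetical, and the calculation assumes that the volumes of 30% H2O2 and methanol are simply additive.

```python
def diluted_percent(stock_percent, stock_ml, diluent_ml):
    """Nominal concentration after mixing, assuming the volumes are additive."""
    total_ml = stock_ml + diluent_ml
    return stock_percent * stock_ml / total_ml


# 2 mL of 30% H2O2 mixed with 30 mL of cold methanol, as stated in the protocol.
h2o2_percent = diluted_percent(stock_percent=30.0, stock_ml=2.0, diluent_ml=30.0)
print(f"Nominal H2O2 in pretreatment solution: {h2o2_percent:.3f}%")
```

Under that assumption the pretreatment solution contains a little under 2% hydrogen peroxide.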

Example 5 - Purification and Characterization of 28.3 kD Breast Cancer Protein

The 28.3 kD breast cancer protein identified in Example 1 was isolated and further 
characterized as follows. 

Approximately 30 mL of serum (combined from multiple breast cancer patients) was
depleted of immunoglobulin G and serum albumin using Protein G chromatography and
Cibacron Blue agarose chromatography, respectively, using standard methodologies such as
those described in Example 1. The albumin and immunoglobulin depleted serum then was
fractionated by Mono Q ion-exchange affinity chromatography. Briefly, the serum proteins were
applied to a 5 mL Mono Q column (Pharmacia and Upjohn, Peapack, NJ) in 50 mM sodium
phosphate buffer, pH 7.0, and the flow through fraction collected. Thereafter, the serum proteins
were eluted stepwise from the column using 50 mM sodium phosphate buffer, pH 7.0, containing
increasing concentrations of sodium chloride. In this manner, 12 serum fractions were obtained,
each containing a different amount of sodium chloride. The fractions included flow through, and
elution buffers of 50 mM sodium phosphate buffer, pH 7.0, containing 25 mM, 50 mM, 75 mM,
100 mM, 125 mM, 150 mM, 200 mM, 250 mM, 300 mM, 400 mM, and 2 M sodium chloride.
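
The stepwise elution described above can be tabulated to confirm that the flow through plus the eleven listed salt steps account for the 12 fractions. The following Python sketch is illustrative only; the names `nacl_steps_mm` and `fraction_labels` are hypothetical, and the numbering of fractions is an assumption made for display, not a numbering used in the procedure itself.

```python
# NaCl concentration (in mM) of each stepwise elution buffer, as listed in the text.
nacl_steps_mm = [25, 50, 75, 100, 125, 150, 200, 250, 300, 400, 2000]


def fraction_labels(steps_mm):
    """Return one label per collected fraction: flow through first, then each salt step."""
    labels = ["flow through"]
    for mm in steps_mm:
        if mm >= 1000:
            labels.append(f"{mm / 1000:g} M NaCl")
        else:
            labels.append(f"{mm} mM NaCl")
    return labels


fractions = fraction_labels(nacl_steps_mm)
for number, label in enumerate(fractions, start=1):
    print(f"fraction {number:2d}: {label}")
```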

The 50 mM sodium chloride fraction containing the protein of interest was subsequently
buffer exchanged back into 50 mM sodium phosphate buffer, pH 7.0, and concentrated by means
of a Centricon 10 (Millipore) in accordance with the manufacturer's instructions. The resulting
sample then was fractionated by size exclusion chromatography on a Sephacryl S-200 column
(Pharmacia) using an isocratic buffer containing 100 mM sodium phosphate, 150 mM NaCl, pH
7.4. Fractions that eluted from the column were evaluated for the presence of the 28.3 kD protein
using the Ciphergen SELDI mass spectroscopy as described in Example 1. Fractions containing
the 28.3 kD protein were pooled and applied to an IMAC column (Sigma) which had been
pre-loaded with Ni2+ by prior incubation with 50 mM NiCl2. The IMAC column then was washed
with 6 bed volumes of a solution containing 100 mM sodium phosphate, 150 mM NaCl, pH 7.4,
and the bound protein fraction eluted with the same solution containing 100 mM imidazole. The
eluted fraction then was concentrated by means of a Minicon 10 (Millipore) and then was
fractionated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) on a
12% Tris glycine SDS-PAGE gel. Samples of the protein fraction were applied to two separate
lanes of the gel. After electrophoresis, the resulting gel then was stained with Coomassie
Brilliant Blue dye and destained to reveal the presence of proteins. Three bands of about 28.3 kD
(characterized as the heaviest molecular weight protein, the medium molecular weight protein,
and the lightest molecular weight protein) were excised from one of the 2 lanes and were eluted
from the acrylamide slices.

The proteins were eluted from the gel as follows. Briefly, the gel slices were washed five
times with HPLC grade water with vigorous vortexing. The washed slices then were cut into
small pieces in 120 µL of 100 mM sodium acetate pH 8.5, 0.1% SDS and incubated overnight at
37°C. The supernatant was decanted into a fresh tube and dried in a speedvac. The resulting
pellet then was reconstituted in 37 µL HPLC grade water. Approximately 1480 µL of cold ethanol
then was added and the resulting mixture incubated overnight at -20°C. The sample was
centrifuged at 4°C for 15 minutes at 11,000 rpm. The supernatant was removed and the resulting
pellet reconstituted in 5 µL of water. The resulting protein solutions were run on the SELDI and
the 28.3 kD protein was identified in one of the three preparations (see Fig. 1A, which corresponds
to the heaviest 28 kD protein). The corresponding band then was excised from the second of the
2 lanes on the gel. After proteolysis with trypsin, the tryptic fragments were eluted from the gel
and submitted for microsequence analysis via mass spectrometry.

Four individual masses were detected by mass spectrometry. When the four masses were 
used to search the Swiss Protein Database, all four masses were found to match amino acid 
sequences present in the protein referred to in the art as U2 small nuclear ribonucleoprotein B" 
(U2 snRNP B") (Habets et al (1987) supra, Swiss Protein Database Accession Number 
4507123). The results are summarized in Table 2. 

TABLE 2.

Peptide  Sequence                 SEQ ID NO  Protein
1        QLQGFPFYGKPMR            1          U2 snRNP B"
2        HDIAFVEFENDGQAGAAR       2          U2 snRNP B"
3        LVPGRHDIAFVEFENDGQAGAAR  3          U2 snRNP B"
4        TVEQTATTTNK              4          U2 snRNP B"

The amino acid sequence of the U2 snRNP B" protein, in an N- to C-terminal direction and
in single-letter amino acid code, is:

MDIRPNHTIY INNMNDKIKK EELKRSLYAL FSQFGHVVDI VALKTMKMRG QAFVIFKELG 
SSTNALRQLQ GFPFYGKPMR IQYAKTDSDI ISKMRGTFAD KEKKKEKKKA KTVEQTATTT 
NKKPGQGTPN SANTQGNSTP NPQVPDYPPN YILFLNNLPE ETNEMMLSML FNQFPGFKEV 
RLVPGRHDIA FVEFENDGQA GAARDALQGF KITPSHAMKI TYAKK (SEQ ID NO: 5) 
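
The correspondence between the four peptides of Table 2 and SEQ ID NO: 5 can be checked directly by locating each peptide within the full-length sequence. The following Python sketch is offered only as an illustration; the names `SEQ_ID_NO_5`, `TABLE_2_PEPTIDES`, `locate`, and `is_tryptic` are hypothetical, and the tryptic check assumes the simple rule that trypsin cleaves after lysine (K) or arginine (R), ignoring any proline effects.

```python
# SEQ ID NO: 5 as given above, with the spacing between blocks of ten removed.
SEQ_ID_NO_5 = (
    "MDIRPNHTIYINNMNDKIKKEELKRSLYALFSQFGHVVDIVALKTMKMRGQAFVIFKELG"
    "SSTNALRQLQGFPFYGKPMRIQYAKTDSDIISKMRGTFADKEKKKEKKKAKTVEQTATTT"
    "NKKPGQGTPNSANTQGNSTPNPQVPDYPPNYILFLNNLPEETNEMMLSMLFNQFPGFKEV"
    "RLVPGRHDIAFVEFENDGQAGAARDALQGFKITPSHAMKITYAKK"
)

# Peptide number -> sequence, as listed in Table 2.
TABLE_2_PEPTIDES = {
    1: "QLQGFPFYGKPMR",
    2: "HDIAFVEFENDGQAGAAR",
    3: "LVPGRHDIAFVEFENDGQAGAAR",
    4: "TVEQTATTTNK",
}


def locate(peptide, protein):
    """Return the 1-based (start, end) of the first occurrence, or None if absent."""
    index = protein.find(peptide)
    if index < 0:
        return None
    return index + 1, index + len(peptide)


def is_tryptic(peptide, protein):
    """True if the peptide ends in K/R and is preceded by K/R (or starts the protein)."""
    span = locate(peptide, protein)
    if span is None:
        return False
    start, _ = span
    preceded_ok = start == 1 or protein[start - 2] in "KR"
    return preceded_ok and peptide[-1] in "KR"


positions = {n: locate(p, SEQ_ID_NO_5) for n, p in TABLE_2_PEPTIDES.items()}
for n, span in positions.items():
    print(f"peptide {n}: residues {span[0]}-{span[1]}")
```

Peptide 3 spans peptide 2 together with the five residues immediately preceding it, which is consistent with an incomplete cleavage at an internal arginine.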

Example 6 - Purification and Characterization of 71 kD Breast Cancer Protein

The 71 kD breast cancer protein identified in Example 1 was isolated and further 
characterized as follows. 

50 mL of serum from each of four individuals was pooled to give a single aliquot of 200
mL. This 200 mL aliquot was subdivided into six aliquots of approximately 33 mL each. Each
aliquot was treated with 19 mL of trifluorotrichloroethane as described in Example 1. Each aliquot
was applied to Protein G and Cibacron Blue columns as described in Example 1. Fractions
containing protein in the flowthrough (approximately 500 mL/aliquot) were pooled and
concentrated to approximately 10 mL/aliquot (60 mL total) using Centricon concentrators.

3 mL aliquots were loaded onto 5 mL Mono Q Sepharose columns (60 mL / 3 mL = 20
aliquots). Fractionation was performed as described in Example 1, except that all volumes were
multiplied by 5. The fractions eluted with 100 mM sodium chloride from each fractionation
were pooled into a single 200 mL fraction and buffer exchanged into binding buffer as described
in Example 1.
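
The volume bookkeeping in the two preceding paragraphs can be followed with a few lines of arithmetic. The following Python sketch is illustrative only; all variable names are hypothetical, and the per-load volume of the 100 mM fraction is merely inferred from the stated 200 mL pool and 20 loads rather than stated in the text.

```python
# Pooling: 50 mL of serum from each of four individuals.
pooled_serum_ml = 4 * 50

# The pool is split into six aliquots (about 33 mL each).
aliquot_ml = pooled_serum_ml / 6

# After depletion, each aliquot is concentrated to about 10 mL.
concentrated_total_ml = 6 * 10

# 3 mL is loaded per Mono Q run.
mono_q_loads = concentrated_total_ml // 3

# Inferred: average volume of 100 mM NaCl eluate contributed by each run
# if the pooled fraction totals 200 mL.
eluate_per_load_ml = 200 / mono_q_loads

print(pooled_serum_ml, round(aliquot_ml, 1), concentrated_total_ml,
      mono_q_loads, eluate_per_load_ml)
```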

The 200 mL fraction was applied to a series of antibody columns to remove abundant
proteins of 50-70 kD. Each of these proteins, alpha-1 anti-trypsin, ceruloplasmin, kallikrein, and
GC-globulin, had been identified and sequenced during preliminary attempts to isolate the 71 kD
protein. Commercial antibodies to each of the proteins were purchased and coupled to a solid
support (agarose) using conventional NHS ester chemistry (Pierce Aminolink Plus kit, part
number 44894). The 200 mL fraction was applied to each column in turn until the protein in
question could no longer be seen in the flowthrough by Western blot analysis.

The flowthrough was subjected to size exclusion chromatography using an S200 column. 
Fractions containing the 71 kD peak were identified by SELDI as described in Example 1. 
Because these fractions also appeared to contain a fragment of human serum albumin (HSA) that 
would not bind to the Cibacron blue column, the fractions were applied to an HSA affinity 
column with two murine antibodies to HSA to deplete the remaining HSA from the sample.
SDS-PAGE analysis of the sample revealed a single band in the 71 kD range by silver staining. 
The remaining sample was divided into two aliquots and run on two lanes of a 10% tris-glycine 
gel. The gel was stained with Coomassie Brilliant Blue dye. The 71 kD band from one of the 
two lanes was excised and eluted from the gel as described in Example 5. Its identity as the 
70.972 kD marker protein was confirmed by SELDI. The 71 kD band from the other lane was 
excised and treated with trypsin. The resulting peptides were eluted from the gel and subjected 
to microsequence analysis by mass spectrometry. Sixteen of the predicted trypsin fragments of 
the 64-kD subunit of cleavage stimulation factor have masses corresponding to those identified 
in the mass spectrum of the 71 kD protein. The sixteen sequences are set forth in Table 3. Two 
reported sequences for cleavage stimulation factor are set forth in the Sequence Listing as SEQ 
ID NO:22 and SEQ ID NO:23. 

TABLE 3.

Peptide  Sequence                      SEQ ID NO  Protein
1        GQVPMQDPR                     6          Cleavage Stimulation Factor
2        GSLPANVPTPR                   7          Cleavage Stimulation Factor
3        GLLGDAPNDPR                   8          Cleavage Stimulation Factor
4        AGLTVRDPAVDR                  9          Cleavage Stimulation Factor
5        ALRVDNAASEKNK                 10         Cleavage Stimulation Factor
6        GGTLLSVTGEVEPR                11         Cleavage Stimulation Factor
7        DIFSEVGPVVSFR                 12         Cleavage Stimulation Factor
8        GIDARGMEARAMEAR               13         Cleavage Stimulation Factor
9        GMEARAMEARGLDAR               14         Cleavage Stimulation Factor
10       AVASLPPEQMFELMK               15         Cleavage Stimulation Factor
11       AMEARAMEVRGMEAR               16         Cleavage Stimulation Factor
12       GYLGPPHQGPPMHHVPGHESR         17         Cleavage Stimulation Factor
13       GPIPSGMQGPSPINMGAVVPQGSR      18         Cleavage Stimulation Factor
14       NMLLQNPQLAYALLQAQVVMR         19         Cleavage Stimulation Factor
15       GGPLPEPRPLMAEPRGPMLDQR        20         Cleavage Stimulation Factor
16       SLGTGAPVIESPYGETISPEDAPESISK  21         Cleavage Stimulation Factor
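
Matching observed masses to predicted trypsin fragments, as described in Example 6, relies on computing a theoretical mass for each candidate peptide. The following Python sketch shows one way such theoretical values could be computed for the sixteen peptides of Table 3. It is illustrative only: the observed masses are not reproduced here, the names `RESIDUE_MASS`, `TABLE_3_PEPTIDES`, and `monoisotopic_mass` are hypothetical, and the calculation assumes standard monoisotopic residue masses for unmodified, uncharged peptides.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565  # one water per peptide (free N- and C-termini)

# Peptide number -> sequence, as listed in Table 3 (SEQ ID NO = number + 5).
TABLE_3_PEPTIDES = {
    1: "GQVPMQDPR", 2: "GSLPANVPTPR", 3: "GLLGDAPNDPR", 4: "AGLTVRDPAVDR",
    5: "ALRVDNAASEKNK", 6: "GGTLLSVTGEVEPR", 7: "DIFSEVGPVVSFR",
    8: "GIDARGMEARAMEAR", 9: "GMEARAMEARGLDAR", 10: "AVASLPPEQMFELMK",
    11: "AMEARAMEVRGMEAR", 12: "GYLGPPHQGPPMHHVPGHESR",
    13: "GPIPSGMQGPSPINMGAVVPQGSR", 14: "NMLLQNPQLAYALLQAQVVMR",
    15: "GGPLPEPRPLMAEPRGPMLDQR", 16: "SLGTGAPVIESPYGETISPEDAPESISK",
}


def monoisotopic_mass(peptide):
    """Theoretical monoisotopic mass (Da) of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


masses = {n: monoisotopic_mass(seq) for n, seq in TABLE_3_PEPTIDES.items()}
for n, seq in TABLE_3_PEPTIDES.items():
    print(f"{n:2d}  SEQ ID NO:{n + 5:<2d}  {masses[n]:10.4f}  {seq}")
```

Under these assumptions, peptides 8 and 9 have the same residue composition apart from an isoleucine/leucine exchange and therefore the same theoretical mass, so a mass measurement alone would not distinguish between them.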

Equivalents

The invention may be embodied in other specific forms without departing from the spirit 
or essential characteristics thereof. The foregoing embodiments are therefore to be considered in
all respects illustrative rather than limiting on the invention described herein. Scope of the 
invention is thus indicated by the appended claims rather than by the foregoing description, and 
all changes that come within the meaning and range of equivalency of the claims are intended to 
be embraced by reference therein. 

Incorporation By Reference

The entire disclosure of each of the aforementioned patent and scientific documents cited
hereinabove is expressly incorporated by reference herein.